(57) Abstract

The invention relates to novel human DNA sequences, targeting constructs, and methods for producing novel genes encoding
thrombopoietin, DNase I, and β-interferon by homologous recombination. The targeting constructs comprise at least (a) a targeting
sequence; (b) a regulatory sequence; (c) an exon; and (d) a splice-donor site. The targeting constructs, which can undergo homologous
recombination with endogenous cellular sequences to generate a novel gene, are introduced into cells to produce homologously recombinant
cells. The homologously recombinant cells are then maintained under conditions which will permit transcription of the novel gene and
translation of the mRNA produced, resulting in production of either thrombopoietin, DNase I, or β-interferon. The invention further relates
to methods of producing pharmaceutically useful preparations containing thrombopoietin, DNase I, or β-interferon from homologously
recombinant cells and methods of gene therapy comprising administering homologously recombinant cells producing thrombopoietin,
DNase I, or β-interferon to a patient for therapeutic purposes.

PROTEIN PRODUCTION AND DELIVERY

Background of the Invention 

Current approaches to treating disease by administering
therapeutic proteins include in vitro production of
therapeutic proteins for conventional pharmaceutical delivery
(e.g., intravenous, subcutaneous, or intramuscular
injection, or by intranasal or intratracheal aerosol
administration) and, more recently, gene therapy.

One protein which may be useful in the treatment of
platelet disorders is thrombopoietin (TPO). Platelets are
small (2-3 microns in diameter) anucleated cells which play
an important role in primary hemostasis by adhering to and
aggregating at sites of vascular damage. In addition,
platelets release factors which are important components of
the blood coagulation, inflammation, and wound healing
pathways. Patients with very low levels of circulating
platelets (thrombocytopenia) exhibit bleeding into superficial
sites (e.g., skin, mucous membranes, genitourinary
tract, and gastrointestinal tract) as a result of mild
trauma, and are at risk for death from catastrophic hemorrhage
occurring spontaneously or resulting from trauma.
The physiologic role of platelets and the etiology of
platelet disorders have been described (cf. Hematology:
Clinical and Laboratory Practice, Eds. R.L. Bick et al.,
pp. 1337-1389, Mosby, St. Louis (1993); Harrison's Principles
of Internal Medicine, Eds. J.D. Wilson et al., 11th
Ed., pp. 1500-1505, McGraw Hill, New York, 1991).

Thrombocytopenia may be caused by decreased production
of platelets by the bone marrow, increased sequestration of
platelets in the spleen, or accelerated platelet destruction.
Decreased production of platelets by the bone marrow
may result from destruction of hematopoietic precursor
cells by irradiation or treatment with cytotoxic agents
during therapy for cancer. In addition, alcohol,
estrogens, and thiazide diuretics can suppress platelet
production (drug-induced thrombocytopenia). Furthermore,
infiltration of the bone marrow by malignant cells and the
disorders congenital amegakaryocytic hypoplasia and
thrombocytopenia with absent radii (TAR syndrome) can result in
decreased platelet production.

Increased splenic sequestration of platelets may occur
as a result of splenomegaly associated with a variety of
conditions, including liver disease, infiltration of the
spleen with tumor cells as in myeloproliferative or
lymphoproliferative disorders, and Gaucher's disease.

Accelerated platelet destruction and thrombocytopenia
may be caused by vasculitis, hemolytic uremic syndrome,
disseminated intravascular coagulation, and the presence of
intravascular prosthetic devices such as cardiac valves.
In addition, certain viral infections, drugs, and autoimmune
disorders lead to immunologic thrombocytopenia in
which platelets become coated with antibody, immune complexes,
or complement and are rapidly cleared from the
circulation. A number of drugs can elicit an immune response
leading to immunologic thrombocytopenia, including
sulfathiazole, novobiocin, para-aminosalicylate, quinidine,
quinine, carbamazepine, digitoxin, arsenical drugs, and
methyldopa.

Thrombocytopenia is currently treated most readily by
transfusion with platelet concentrates, although corticosteroid
therapy or plasmapheresis can be effective in
immunologic thrombocytopenia. Treatment with platelet
concentrates is severely limited by availability of suitable
donors and the risk of transmission of blood-borne
infectious diseases.

As an alternative to transfusion therapy, platelet
deficiencies could be treated with hematopoietic growth
factors which promote proliferation and maturation of
megakaryocytes, the nucleated progenitor cells from which
platelets are derived. Recently, cDNA clones were isolated
which encode the human, mouse, and dog analogs of a protein
purified from aplastic porcine plasma which displays
megakaryocytopoietic activity (de Sauvage, F.J. et al.
Nature 369:533-538 (1994); Lok, S. et al. Nature 369:565-568
(1994); Bartley, T.D. et al. Cell 77:1117-1124 (1994)).
The encoded protein, termed thrombopoietin (TPO), stimulates
proliferation and maturation of megakaryocytes and
induces platelet production in vivo upon injection into
experimental animals.

Methods for the production and delivery of other
proteins with therapeutic properties are desirable. For
example, it has been demonstrated that recombinant
β-interferon is an effective medication for treatment of
exacerbations in patients with relapsing-remitting multiple
sclerosis (MS; see Kelley, C.L. and Smeltzer, S.C. J.
Neuroscience Nursing 25:52-56 (1994)). Furthermore, it has
been reported that β-interferon isolated from
non-transfected cultured human fibroblasts may be an effective
means for preventing the progression of acute non-A, non-B
hepatitis to chronic disease (Omata, M. et al., Lancet
338:914-915 (1991)).

As another example, it has been demonstrated that
recombinant human DNase I is an effective agent for
reducing the viscosity of sputum from cystic fibrosis (CF)
patients (Shak, S. et al., Proc. Natl. Acad. Sci. USA
87:9188-9192 (1990)) and for improving pulmonary function
and decreasing exacerbations of respiratory disease in CF
patients (Fuchs, H.J. et al., New Engl. J. Med. 331:637-642
(1994)). It has been further suggested that DNase I may be
effective in improving respiratory function in patients
with other respiratory diseases, such as chronic bronchitis
and pneumonia (Shak, S. et al., op. cit.).

While TPO, β-interferon, and DNase I are useful, for
example, in the treatment of thrombocytopenia, MS, and CF,
respectively, production of therapeutic proteins using
genetic engineering technology as taught in the prior art
is limited to conventional recombinant DNA methods, in
which the recombinant protein is purified from mammalian
cells expressing an exogenous cloned gene or cDNA under the
control of a suitable promoter. The exogenous DNA encoding
the protein of interest is introduced into cells in the
form of a viral vector, circular plasmid DNA, or linear DNA
fragment. Chinese Hamster Ovary (CHO) cell lines and their
derivatives (Gottesman, M.M. Meth. Enzymol. 151:3-8 (1987))
or mouse cell lines, such as NSO (Galfre, G. and Milstein,
C. Meth. Enzymol. 73(B):3-46 (1981)) or P3X63Ag8.653
(Kearney, J. et al. J. Immunol. 123:1548-1550 (1979)) are
commonly used, and the production of human therapeutic
proteins is thus accomplished by expression and purification
of the protein from a cell of non-human origin.

In many cases, it is desirable to produce human
therapeutic proteins in a human cell, for example, when it
is desired that the glycosylation pattern of the protein be
similar to patterns normally found on human cells. In
addition, the expression of human proteins in human cells
is important in the development of gene therapy methods, in
which a patient's cells are engineered to produce a desired
therapeutic protein to alleviate the symptoms or cure a
disease.

Clearly, the development of novel methods for the
production of these human proteins in human cells would be
of benefit to patients, through the availability of a wider
range of products with therapeutic effectiveness. One
approach proposed by scientists in the field for
accomplishing this goal is to use homologous recombination,
or gene targeting, to introduce a cloned, exogenous
regulatory element (i.e., a promoter and/or enhancer) into a
cell's genome at a pre-selected site such that the
regulatory element activates expression of a nearby gene,
ultimately resulting in production of the protein encoded
by that gene. This approach has been suggested in U.S.
Patent No. 5,272,071 and in foreign patent applications
WO 91/06666, WO 91/06667 and WO 90/11354.



Summary of the Invention

Described herein are new methods for producing TPO,
DNase I, and β-interferon through the generation of novel
transcription units within a cell's genome, methods which
differ dramatically from those in the art and represent a
major advance in the ability to manipulate expression in
mammalian cells. The methods are based on the fact that an
exogenous regulatory sequence, an exogenous exon, either
coding or non-coding, and a splice-donor site can be
introduced into a preselected site in the genome by
homologous recombination. The resulting cells are referred
to as targeted or homologously recombinant cells. The
introduced DNA is positioned such that transcripts under
the control of the exogenous regulatory region include both
the exogenous exon and endogenous exons present in either
the TPO, DNase I, or β-interferon genes, resulting in
transcripts in which the exogenous and endogenous exons are
operatively linked. The novel transcription units produced
by homologous recombination allow TPO, DNase I, or β-interferon
to be produced in human cells using the naturally-occurring
endogenous exons encoding these proteins without
introducing any portion of the coding sequences of the
cognate genes. The present invention further relates to
improved materials and methods for both the in vitro
production of TPO, β-interferon, and DNase I and for the
production and delivery of TPO, β-interferon, and DNase I
by gene therapy.

The methods of the present invention teach the
production of TPO, β-interferon, or DNase I by gene
activation, in which the coding DNA sequence of the
corresponding protein is not introduced into a cell by
transfection of exogenous DNA encoding the protein.
Instead, noncoding sequences upstream of one of these genes
or coding or noncoding sequences within the genes are
manipulated by gene targeting to create a novel
transcription unit which expresses TPO, β-interferon, or
DNase I. It is a purpose of this invention to define
sequences upstream of the TPO, β-interferon, or DNase I
genes, non-coding sequences (introns and 5' non-translated
sequences) within the human TPO, β-interferon, or DNase I
genes, and methods for utilizing these sequences for the
production of TPO, β-interferon, or DNase I.

The methods described herein teach production of TPO,
β-interferon, or DNase I proteins, by the generation of
novel genes in which exogenous and endogenous exons are
operatively linked. As a result of introduction of
exogenous components into the chromosomal DNA of a cell,
the expression of the protein encoded by the endogenous
gene is activated. Other forms of altered gene expression
may be envisioned, such as increasing expression of a gene
which is expressed in the cell as obtained, changing the
pattern of regulation or induction such that it is
different than occurs in the cell as obtained, and reducing
(including eliminating) expression of a gene which is
expressed in the cell as obtained. For example, it may be
desirable to perform in vitro protein production or gene
therapy to produce a protein other than TPO, DNase I, or
β-interferon using a cell type that naturally produces one
of these proteins. In these settings, it would be desirable
to eliminate expression of TPO, DNase I, or
β-interferon.

The present invention further relates to DNA
constructs useful in the method of activation of the TPO,
β-interferon, or DNase I genes. The DNA constructs
comprise: (a) targeting sequences; (b) a regulatory
sequence; (c) an exon; and (d) an unpaired splice-donor
site. The targeting sequence in the DNA construct is
derived from chromosomal DNA lying within and/or upstream
of the desired gene and directs the integration of elements
(a) - (d) into the chromosomal DNA in a cell such that the
elements (b) - (d) are operatively linked to sequences of
the desired endogenous gene. In another embodiment, the
DNA constructs comprise: (a) a targeting sequence, (b) a
regulatory sequence, (c) an exon, (d) a splice-donor site,
(e) an intron, and (f) a splice-acceptor site, wherein the
targeting sequence in the DNA construct is derived from
chromosomal DNA lying within and/or upstream of the desired
gene and directs the integration of elements (a) - (f) such
that the elements of (b) - (f) are operatively linked to
the desired endogenous gene. The targeting sequence is
homologous to the preselected site within or upstream of
the TPO, β-interferon, or DNase I genes in the cellular
chromosomal DNA with which homologous recombination is to
occur. In the construct, the exon is generally 3' of the
regulatory sequence and the splice-donor site is 3' of the
exon. Constructs of this type are disclosed in pending
U.S. patent applications U.S.S.N. 07/985,586 and U.S.S.N.
08/243,391, all of which are incorporated herein by
reference.
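The ordering rule stated above (exon 3' of the regulatory sequence, splice-donor site 3' of the exon) can be illustrated with a minimal Python sketch. The class name, function name, element labels, and the placement of the elements between two targeting sequences are all hypothetical choices made for illustration; they are not taken from the specification and do not model real sequences.

```python
from dataclasses import dataclass

# Element kinds named in the text; the identifiers themselves are hypothetical.
REQUIRED_ORDER = ["regulatory_sequence", "exon", "splice_donor"]


@dataclass(frozen=True)
class Element:
    kind: str   # "targeting_sequence", "regulatory_sequence", "exon", "splice_donor"
    label: str  # free-text description, for readability only


def is_valid_construct(elements):
    """Check a construct listed 5'->3'.

    Requires at least one targeting sequence, and requires the regulatory
    sequence, exon, and splice-donor site to appear in that 5'->3' order.
    """
    kinds = [e.kind for e in elements]
    if "targeting_sequence" not in kinds:
        return False
    try:
        positions = [kinds.index(kind) for kind in REQUIRED_ORDER]
    except ValueError:
        return False  # one of the required elements is missing
    return positions == sorted(positions)


# Illustrative layout (an assumption): elements flanked by two targeting sequences.
example = [
    Element("targeting_sequence", "first targeting sequence"),
    Element("regulatory_sequence", "exogenous promoter"),
    Element("exon", "exogenous exon"),
    Element("splice_donor", "unpaired splice-donor site"),
    Element("targeting_sequence", "second targeting sequence"),
]
print(is_valid_construct(example))
```

The check is deliberately structural: it says nothing about homology or sequence content, only about the relative order of the elements.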

The following serves to illustrate two embodiments of
the present invention, in which the sequences upstream of
the TPO gene are altered to allow expression of TPO in
primary, secondary, or immortalized cells which do not
express TPO in detectable quantities in their untransfected
state as obtained. In embodiment 1 (Figure 1), the
targeting construct contains two targeting sequences. Both
the first and second targeting sequences are homologous to
sequences upstream of the TPO coding region, with the first
targeting sequence 5' of the second targeting sequence.
The targeting construct also contains a regulatory region,
an exon (which, in this case, comprises noncoding sequences
and begins at a CAP site) and an unpaired splice-donor
site. The homologous recombination event that generates
the novel transcription unit producing TPO is shown in
Figure 1.

In embodiment 2 (Figure 2), the targeting construct
also contains two targeting sequences. The first targeting
sequence is homologous to sequences upstream of the
endogenous TPO coding region, and the second targeting
sequence is homologous to the second intron of the TPO
gene. The targeting construct also contains a regulatory
region, an exon (in this case a coding exon derived from
the human growth hormone (hGH) gene) and an unpaired
splice-donor site. The homologous recombination event that
generates the novel transcription unit producing TPO is
shown in Figure 2.

In these two embodiments, the products of the
targeting events are novel transcription units which
generate a mature mRNA in which an exogenous exon is
positioned upstream of exon 2 (Embodiment 1) or exon 3
(Embodiment 2) of the endogenous TPO gene. The product of
transcription, splicing, translation, and post-translational
cleavage of the signal peptide is mature TPO.
Embodiments 1 and 2 differ with respect to the relative
positions of the regulatory sequences of the targeting
construct that are inserted and the specific pattern of
splicing that needs to occur to produce the final,
processed transcript.
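The difference between the two splicing patterns can be sketched as a toy exon-ordering model in Python. The function name and the exon labels are hypothetical, and the endogenous exon list is limited to exons 1 to 3 because those are the only TPO exons named in this part of the text; exons further downstream are simply omitted from the sketch.

```python
def mature_transcript(exogenous_exon, endogenous_exons, first_endogenous_exon):
    """Return the exon order of the spliced mRNA.

    The exogenous exon is followed by the endogenous exons, starting at the
    1-based exon number that its splice-donor site is spliced to.
    """
    if not 1 <= first_endogenous_exon <= len(endogenous_exons):
        raise ValueError("exon number out of range")
    return [exogenous_exon] + endogenous_exons[first_endogenous_exon - 1:]


# Only the exons named in the text; the real gene continues downstream.
tpo_exons = ["TPO exon 1", "TPO exon 2", "TPO exon 3"]

# Embodiment 1: exogenous noncoding exon spliced to TPO exon 2.
embodiment_1 = mature_transcript("exogenous noncoding exon", tpo_exons, 2)
# Embodiment 2: exogenous hGH-derived coding exon spliced to TPO exon 3.
embodiment_2 = mature_transcript("exogenous hGH coding exon", tpo_exons, 3)

print(embodiment_1)
print(embodiment_2)
```

In both cases endogenous exon 1 is absent from the modeled transcript, which reflects the statement that the exogenous exon sits upstream of exon 2 or exon 3 in the mature mRNA.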

The invention further relates to a method of
producing TPO, β-interferon, or DNase I in vitro or in vivo
through introduction of a construct as described above into
host cell chromosomal DNA by homologous recombination to
produce a homologously recombinant cell. The homologously
recombinant cell is then maintained under conditions which
will permit transcription, translation and secretion of
TPO, β-interferon, or DNase I.

The present invention also relates to cells, such as
homologously recombinant primary or secondary cells (i.e.,
non-immortalized cells) and homologously recombinant
immortalized cells, useful for producing TPO, β-interferon,
or DNase I, methods of making such cells, methods of using
the cells for in vitro protein production, and methods of
gene therapy. Homologously recombinant cells of the
present invention are of vertebrate origin, particularly of
mammalian origin, and even more particularly of human
origin. Homologously recombinant cells produced by the
method of the present invention contain exogenous DNA which
causes the homologously recombinant cells to express a
desired gene at a higher level or with a pattern of regulation
or induction that is different than occurs in the
corresponding cell that has not undergone homologous
recombination.

In one embodiment, the activated TPO, β-interferon, or
DNase I gene can be further amplified by the inclusion of
an amplifiable selectable marker gene which has the
property that cells containing amplified copies of the
selectable marker gene can be selected for by culturing the
cells in the presence of the appropriate selectable agent.
The activated gene is amplified in tandem with the amplifiable
selectable marker gene. Cells containing many copies
of the activated gene are useful for in vitro protein
production and gene therapy.

Homologously recombinant cells of the present
invention are useful in a number of applications in humans
and animals. In one embodiment, the cells can be implanted
into a human or an animal for protein delivery in the human
or animal. For example, TPO, DNase I, or β-interferon can
be delivered systemically or locally in humans for
therapeutic benefit in the treatment of disease (TPO for
thrombocytopenia, DNase I for CF, or β-interferon for the
treatment of MS). In addition, homologously recombinant
non-human cells producing TPO, DNase I, or β-interferon of
non-human origin may be produced, and human or non-human
cells expressing TPO, DNase I, or β-interferon may be
enclosed within barrier devices and implanted into humans
or animals for use in a therapy.

Brief Description of the Drawings 

Figure 1 is a schematic diagram of a strategy for
transcriptionally activating the TPO gene by the creation
of a novel transcription unit; thick lines: targeting
sequences; thin lines: introns and 5' upstream region;
cross-hatched box: regulatory sequence; stippled boxes:
noncoding exon sequences; black boxes: coding exon
sequences; open boxes: splice sites. The splice-donor site
(SD) of the exogenous exon in the targeting construct and
the splice-acceptor site (SA) flanking TPO exon 2 which is
involved in splicing to the exogenous exon are indicated.

Figure 2 is a schematic diagram of a strategy for
transcriptionally activating the TPO gene by the creation
of a novel transcription unit; thick lines: targeting
sequences; thin lines: intron 1 and 5' upstream region;
cross-hatched box: regulatory sequence; stippled boxes:
noncoding exon sequences; black boxes: coding exon
sequences; open boxes: splice sites. The splice-donor site
(SD) of the exogenous exon in the targeting construct and
the splice-acceptor site (SA) flanking TPO exon 3 which is
involved in splicing to the exogenous exon are indicated.
Figure 3 presents the 6,943 bp genomic XbaI fragment
encompassing the 5' flanking region and exons 1, 2, and 3
of the human thrombopoietin (TPO) gene. The XbaI fragment
is depicted by the solid line, while exons 1, 2, and 3 are
represented by the solid boxes. The nucleotide positions
of the ApaI, BamHI, HindIII, EcoRI, NotI, SfiI and XbaI
recognition sequences are indicated. Nucleotides are
numbered starting at the hTPO ATG initiation codon.

Figures 4A-4D present the nucleotide sequence of
4,488 bp of genomic DNA (SEQ ID NO: 3) from the human TPO
locus lying 5' to the known cDNA sequence (de Sauvage et
al., op. cit.). Nucleotide numbers are noted at the
beginning of each line. Numbering is based on the ATG
initiation codon at position 1 (see Figures 5A-5B).
Ambiguities in the nucleotide sequence are represented
using the following code: R = A or G (purine); H = A, C, or
T; V = A, C, or G; N = A, C, G, or T; K = G or T; S = G or
C; W = A or T. The recognition sites for ApaI, BamHI,
HindIII, NotI, SfiI and XbaI and their corresponding
nucleotide positions are indicated above the sequence.
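The ambiguity code listed above maps each symbol to a set of possible bases, which is enough to ask whether a recognition site could occur at a given place in an ambiguous sequence. The following Python sketch illustrates this under stated assumptions: the function names are hypothetical, the test sequence is made up for illustration and is not taken from SEQ ID NO: 3, and offsets are 0-based rather than numbered from the ATG initiation codon. GAATTC is the standard EcoRI recognition site.

```python
# Ambiguity symbols as defined in the text, plus the four unambiguous bases.
AMBIGUITY = {
    "A": {"A"}, "C": {"C"}, "G": {"G"}, "T": {"T"},
    "R": {"A", "G"},            # purine
    "H": {"A", "C", "T"},
    "V": {"A", "C", "G"},
    "N": {"A", "C", "G", "T"},
    "K": {"G", "T"},
    "S": {"G", "C"},
    "W": {"A", "T"},
}


def compatible(site, window):
    """True if an unambiguous site could match a window that may contain ambiguity symbols."""
    if len(site) != len(window):
        return False
    return all(base in AMBIGUITY[symbol] for base, symbol in zip(site, window))


def find_possible_sites(site, sequence):
    """Return 0-based offsets where the site is compatible with the sequence."""
    n = len(site)
    return [i for i in range(len(sequence) - n + 1)
            if compatible(site, sequence[i:i + n])]


# Made-up sequence containing the ambiguity symbols R and N.
print(find_possible_sites("GAATTC", "TTGRATTCNN"))
```

A hit at an ambiguous position means only that the site is possible there, not that it is confirmed; resolving the ambiguity would require the underlying sequence data.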

Figures 5A-5B present the nucleotide sequence of
2,455 bp of genomic DNA (SEQ ID NO: 4) from the human TPO
locus extending downstream from the position of the 5' end
of the known cDNA sequence (de Sauvage et al., op. cit.).
Nucleotide numbers are noted at the beginning of each line.
Numbering is based on the ATG initiation codon at
position 1. Shown are exon 1, intron 1, exon 2, intron 2,
exon 3, and a portion of intron 3. Exons 1, 2, and 3 are
underlined, and the coding portions of exons 2 and 3 are
noted as underlined triplets. The intron-exon boundaries
are deduced from the published cDNA sequence (de Sauvage et
al., op. cit.). The recognition sites for ApaI, EcoRI, and
XbaI and their corresponding nucleotide positions are
indicated above the sequence.

Figure 6 is a schematic diagram of the strategy for
activating the human TPO gene using targeting construct
pTPO1 as described in Example 2. The positions of the dhfr
and neo markers, the exogenous CMV promoter and TPO exons
1-3 are indicated. Thick lines: targeting sequences; thin
lines: introns and 5' upstream region; cross-hatched box:
CMV promoter; stippled boxes: noncoding exon sequences;
black boxes: coding exon sequences; open boxes: splice
sites. The splice-donor site (SD) of the exogenous exon in
the targeting construct and the splice-acceptor site (SA)
flanking TPO exon 3 which is involved in splicing to the
exogenous exon are indicated. Recognition sites for BamHI
(B), NotI (N), ClaI (C), XhoI (X), and XbaI which are
relevant to the construction of the targeting construct are
marked.

Figure 7 is a schematic diagram of the strategy for
activating the human TPO gene using targeting construct
pTPO2 as described in Example 2. The positions of the dhfr
and neo markers, the exogenous CMV promoter and TPO exons
1-3 are indicated. Thick lines: targeting sequences; thin
lines: introns and 5' upstream region; cross-hatched box:
CMV promoter; heavily stippled boxes: noncoding exons from
the CMV IE gene; lightly stippled boxes: noncoding exon
sequences of TPO exons 1 and 2; black boxes: coding exon
sequences of TPO exons 2 and 3; open boxes: splice sites.
The splice-donor (SD) and splice-acceptor (SA) sites
flanking the noncoding exons in the targeting construct and
the splice-acceptor site (SA) flanking TPO exon 2 which is
involved in splicing to the unpaired splice-donor site of
the 3' exogenous exon are indicated. Recognition sites for
BamHI (B), HindIII (H), NotI (N), ClaI (C), SalI (S), EcoRI
(R), and XbaI which are relevant to the construction of the
targeting construct are marked.

Figure 8 is a schematic diagram of the strategy for
activating the human TPO gene using targeting construct
pTPO3 as described in Example 2. The positions of the dhfr
and neo markers, the exogenous CMV promoter and TPO exons
1-3 are indicated. Thick lines: targeting sequences; thin
lines: introns and 5' upstream region; cross-hatched box:
CMV promoter; stippled boxes: noncoding exon sequences of
TPO exons 1 and 2; black boxes: coding exon sequences (the
coding exon corresponding to hGH exon 1 in the targeting
construct and in the novel transcription unit is marked);
open boxes: splice sites. The splice-donor site (SD) of
the exogenous exon in the targeting construct and the
splice-acceptor site (SA) flanking TPO exon 3 which is
involved in splicing to the exogenous exon are indicated.
Recognition sites for BamHI (B), HindIII (H), ClaI (C),
XhoI (X), EcoRI (R), and XbaI which are relevant to the
construction of the targeting construct are marked.

Figure 9 is a diagrammatic representation of the
approximately 8 kb HincII fragment encompassing the 5'
flanking region, exons 1 and 2, and the sequences downstream
of exon 2 of the human DNase I gene. The HincII
fragment is depicted by the solid line, while exons 1 and 2
are represented by solid rectangular boxes. The nucleotide
positions of the ApaI, BamHI, HincII, EspI, SphI and SmaI
recognition sequences are indicated. Nucleotides are
numbered starting at the AUG initiation codon. The
nucleotide positions which reside upstream of exon 2 are
based on the DNA sequence presented in Figures 10 and 11.

Figures 10A-10D present the nucleotide sequence
encompassing 4,042 bp of DNA (SEQ ID NO: 17) from the human
DNase I locus lying 5' to the known cDNA sequence (Shak, S.
et al., op. cit.). Nucleotide numbers are noted at the
beginning of each line. Numbering is based on the ATG
initiation codon at position 1 (see Figure 11). The
recognition sites and the corresponding nucleotide
positions for ApaI, BamHI, HincII, EspI, and SphI are
indicated above the sequence.

Figure 11 presents the nucleotide sequence of 810 bp
of DNA (SEQ ID NO: 18) from the human DNase I locus
extending downstream from the position of the 5' end of the
known cDNA sequence (Shak, S. et al., op. cit.). Shown are
exon 1, intron 1, and a portion of exon 2. Exon 1 and 2
sequences are underlined and the coding sequences are noted
as underlined triplets. The positions of the putative CAP
site and the AUG initiation codon are indicated. The
intron-exon boundaries are deduced from the published cDNA
sequence (Shak, S. et al., op. cit.).

Figure 12 shows a strategy for activation of the human
DNase I gene by homologous recombination. The targeting
fragment is a 4633 bp BamHI fragment from pDNasel which
contains: 283 bp of 5' targeting sequence from position
-1162 (BamHI site) to -860 (ApaI site), an amplifiable dhfr
expression unit, neo gene, CMV IE promoter, a CAP site, a
non-coding exon, an unpaired splice-donor site and 363 bp of
3' targeting sequence from position -860 (EspI site) to
-468 (BamHI site). The dhfr expression unit and the neo
gene are depicted by open arrows; the orientation of the
arrows represents the direction of transcription. The
positions of the CMV promoter, TATA box, CAP site and
splice donor sequence (SD) are indicated. Activation of
the DNase I gene is achieved by integration of the
targeting fragment into the genome of the recipient cells
by homologous recombination. The targeted gene product is
depicted in the lower panel of the figure. The mRNA
precursor, which includes a non-coding 5' exon, a chimeric
intron and exon 2 of the DNase gene, is represented by the
thin arrow.

Figure 13 is a diagrammatic representation of 9,939 bp
encompassing the 5' flanking region, coding sequence and
the 3' untranslated region of the human β-interferon gene.
The 5' and 3' flanking regions are depicted by the solid
line and the transcribed region is represented by the solid
box. The nucleotide positions of the BalI, BglII, EcoRI and
PvuII recognition sequences are indicated. Nucleotides are
numbered starting at the β-interferon ATG translational
initiation codon (see Figure 15).

Figures 14A-14G present the nucleotide sequence of
8,355 bp of DNA (SEQ ID NO: 23) from the human β-interferon
locus lying 5' to the known sequence (GenBank HUMIFNBIF).
Nucleotide numbers are noted at the beginning of each line.
Numbering is based on the ATG initiation codon at position
1 (see Figure 15). The recognition sites for BglII, EcoRI
and PvuII and their corresponding nucleotide positions are
indicated above the sequence.

Figures 15A-15B present the nucleotide sequence of
1,584 bp of DNA (SEQ ID NO: 24) from the human β-interferon
locus extending downstream from the 5' end of the known
sequence (GenBank HUMIFNBIF). Nucleotide numbers are noted
at the beginning of each line. Numbering is based on the
ATG initiation codon at position 1. The transcribed region
is underlined and the coding sequences are noted as underlined
triplets. The positions of the CAP site and AUG
initiation codon are indicated. The recognition sites for
BalI, BglII and PvuII and their corresponding nucleotide
positions are indicated above the sequence.

Figure 16 depicts the strategy for activation of the
human β-interferon gene by homologous recombination using
targeting construct pIFNb-1 as described in Example 7. The
positions of the TATA box, CAP site, dhfr and neo markers,
the exogenous CMV promoter, and the β-interferon 5' flanking
region and coding sequence are indicated. Thick lines:
targeting sequences; thin lines: intron, β-interferon 5'
and 3' non-coding sequences; solid box: CMV promoter;
shaded box: endogenous β-interferon transcribed region;
cross-hatched box: non-coding CMV exon 1 and the chimeric
exon 2. The splice-donor site (SD) of the exogenous exon
and the splice-acceptor site (SA) flanking the chimeric
exon 2 are indicated. Recognition sites for BamHI, EcoRI,
HincII, NdeI and PvuII which are relevant to the
construction of the targeting construct are marked.



Detailed Description of the Invention 

WO 96/29411 PCT/US96/03377

The present invention, as set forth above, relates to a
method of expressing TPO, DNase I, or β-interferon in human
cells by activation of the endogenous TPO, DNase I, or
β-interferon genes. In the present invention, homologous
recombination is used to insert a regulatory region, an
exon, and a splice-donor site upstream of endogenous exons
coding for TPO, DNase I, or β-interferon, generating novel
transcription units which are active in the homologously
recombinant cell produced. The present invention further
relates to homologously recombinant cells produced by the
present method and to uses of the homologously recombinant
cells. In a related embodiment, an activated TPO, DNase I,
or β-interferon gene is amplified subsequent to activation,
thus allowing enhanced expression of the activated gene.

The invention is based upon the discovery that the
regulation or activity of endogenous genes of interest in a
cell can be altered by creating a novel gene, in which the
transcription product of the gene combines exogenous and
endogenous exons and is under the control of an exogenous
promoter. The method is practiced by inserting into a
cell's genome, at a preselected site, through homologous
recombination, DNA constructs comprising: (a) one or more
targeting sequences; (b) a regulatory sequence; (c) an exon;
and (d) an unpaired splice-donor site, wherein the targeting
sequence or sequences are derived from chromosomal DNA
within and/or upstream of a desired endogenous gene and
direct the integration of elements (a) - (d) such that
elements (b) - (d) are operatively linked to the endogenous
gene. In another embodiment, the DNA constructs comprise:
(a) one or more targeting sequences, (b) a regulatory
sequence, (c) an exon, (d) a splice-donor site, (e) an
intron, and (f) a splice-acceptor site, wherein the targeting
sequence or sequences are derived from chromosomal DNA
within and/or upstream of a desired endogenous gene and
direct the integration of elements (a) - (f) such that
elements (b) - (f) are operatively linked to the first
exon of the endogenous gene.

The present invention relates particularly to novel
DNA sequences that can be used in the construction of
targeting constructs. Non-coding genomic DNA sequences
within and upstream of the transcribed regions of the TPO
and DNase I genes, and upstream of the transcribed region
of the β-interferon gene, were cloned and are described for
the first time. These sequences or DNA fragments comprising
these sequences may be used as targeting sequences in
DNA constructs useful for gene activation by homologous
recombination. Typically, a targeting sequence is at least
about 20 base pairs in length. The size of the sequence is
chosen to be a size which selectively promotes homologous
recombination with desired genomic DNA sequences.

Analysis of the genomic DNA sequences and comparison
to the known cDNA sequences revealed features essential for
the construction of targeting constructs. For example, for
the first time, it is shown that the first exon of the
human TPO gene is entirely non-coding, and that translation
initiates within the second exon of the endogenous gene.
This information was important to the design of the gene
activation constructs described herein, in which splicing
of an exogenous exon to the endogenous second exon requires
that the exogenous exon be non-coding, or in which splicing
of an exogenous coding exon requires that targeting be
performed such that the exogenous coding exon is inserted
in a position so that it can be spliced to the endogenous
third exon of the TPO gene. Furthermore, the cloning of
approximately 6.3 kb of DNA sequence from upstream of the
human TPO gene provided targeting sequences useful for the
development of gene activation constructs. Figure 4 shows
approximately 4.5 kb of novel DNA sequence from the human
TPO locus lying 5' of the known cDNA sequence (de Sauvage,
F. J. et al., op. cit.). Figure 5 shows approximately
2.5 kb of DNA sequence from the human TPO locus extending
in the 3' direction from the 5' boundary of the known cDNA
sequence. Intron sequences (positions -1815 to -145,
positions 14 to 245, and positions 374 to 570) of Figure 5
are novel. DNA constructs comprising the novel sequences
of Figures 4 and 5, or fragments derived from these
sequences, are useful for homologous recombination as
taught herein.

Similarly, for the first time it is shown that the
first exon of the human DNase I gene is entirely
non-coding. This information was important to the design of
the targeting constructs described herein. Example 5, for
example, describes a targeting construct which includes two
non-coding exons separated by an intron, and which is
inserted upstream of DNase I exon 1. This configuration
allows promoter position to be optimized by varying the
length of either the exogenous intron or the intron present
between the exogenous exon and the endogenous second exon
of the DNase I gene, while ensuring that the primary
transcript will be spliced appropriately and that
translation initiates at the correct position for synthesis
of functional DNase I. Furthermore, the cloning of
approximately 4.5 kb of DNA sequence from upstream of the
human DNase I gene provided targeting sequences useful for
the development of gene activation constructs. Figure 10
shows approximately 4 kb of novel DNA sequence from the
human DNase I locus lying 5' of the known cDNA sequence
(Shak, S. et al., op. cit.). Figure 11 shows approximately
0.8 kb of DNA sequence from the human DNase I locus
extending in the 3' direction from the 5' boundary of the
known cDNA sequence. Intron sequences (positions -328 to
-2) of Figure 11 are novel. DNA constructs comprising the
novel sequences of Figures 10 and 11, or fragments derived
from these sequences, are useful for homologous
recombination as described herein.

Finally, the upstream region of the β-interferon gene
(a gene which is known to lack introns) was cloned and
sequenced and a detailed restriction map was produced.
Previously, only 357 bp of DNA upstream of the
translation initiation codon was characterized (see Genbank
entry HUMIFNBIF). The cloning and sequence analysis
provided approximately 9.6 kb of genomic DNA upstream of
the gene for the design and construction of a targeting
construct (Example 7). Figure 14 shows approximately
8.4 kb of novel DNA sequence from the β-interferon locus
lying 5' of the known sequences (Genbank entry HUMIFNBIF).
DNA constructs comprising the novel sequences of Figure 14,
or fragments derived from these sequences, are useful for
homologous recombination as taught herein.

The following defines the DNA constructs of the
present invention, the elements comprising the DNA
constructs of the present invention (Section A), methods in
which the DNA constructs are used to produce homologously
recombinant cells (Section B), the structure of the
targeted gene and the resulting product (Section C), the
homologously recombinant cells produced (Section D), uses
of these cells (Sections E and F), and the advantages of
the constructs and methods described herein (Section G).

A. The DNA Construct 

The DNA constructs of the present invention include at
least the following components: a targeting sequence; a
regulatory sequence; an exon; and a splice-donor site. In
the construct, the exon is 3' of the regulatory sequence
and the splice-donor site is 3' of the exon. In addition,
there can be multiple exons and/or introns preceding (5'
to) the exon flanked by the splice-donor site. Taken as a
group, the exons, introns, and splice-sites are referred to
as the "structural elements" of the construct, so-called
because they are important in defining the structure of the
novel gene produced by homologous recombination between
genomic DNA and DNA of the targeting construct. As
described herein, there frequently are additional construct
components, such as selectable and/or amplifiable
markers.
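
The ordering constraint stated above can be illustrated with a
short sketch. The Python fragment below is not part of the
disclosed constructs; the element labels ("targeting",
"regulatory", "exon", "splice_donor") and the function name are
hypothetical and chosen only for illustration. It checks that,
reading 5' to 3', a regulatory sequence is followed by an exon
and then by a splice-donor site.

```python
from typing import List


def has_required_order(elements: List[str]) -> bool:
    """Return True if, reading 5' to 3', a regulatory sequence is
    followed by an exon, which is in turn followed by a splice-donor
    site. Other elements (e.g. markers) may lie in between."""
    try:
        reg = elements.index("regulatory")
        exon = elements.index("exon", reg + 1)
        elements.index("splice_donor", exon + 1)
    except ValueError:
        # One of the required elements is missing or out of order.
        return False
    return True


# Hypothetical linear construct with two flanking targeting sequences.
construct = ["targeting", "regulatory", "exon", "splice_donor", "targeting"]
print(has_required_order(construct))
```

The check is deliberately loose: it confirms only the relative
order of the three required elements and says nothing about their
sequence content or spacing.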

The DNA in the construct is referred to as exogenous
DNA, defined herein as DNA which is introduced into a cell
by the methods described herein, such as with the DNA
constructs of the present invention. Exogenous DNA can
contain sequences identical to or different from the
endogenous DNA. The term endogenous DNA is defined herein
as DNA present in the cell as obtained.

The DNA of the construct can be obtained from sources
in which it occurs in nature or can be produced using
genetic engineering techniques or synthetic processes.

1. The Targeting Sequence 

The targeting sequence or sequences are DNA sequences
which permit homologous recombination into the genome of
the selected cell containing the gene of interest.
Targeting sequences are, generally, DNA sequences which are
homologous to (i.e., identical or sufficiently similar to)
DNA sequences present in the genome of the cells as
obtained (e.g., coding or noncoding DNA, located upstream
of the transcriptional start site, within the transcribed
region encompassing the gene, or downstream of the
transcriptional stop site of the gene, or sequences present
in the genome through a previous modification), such that
the targeting sequence and cellular DNA can undergo
homologous recombination. In general, two sequences are
described as homologous if a DNA strand of one sequence is
capable of hybridizing to a DNA strand of the other
sequence under conditions standardly used for the detection
of sequence similarity (see, for example, Ausubel et al.,
Current Protocols in Molecular Biology, Wiley, New York,
NY (1987)). The targeting sequence or sequences used are
selected with reference to the site into which the DNA in
the DNA construct is to be inserted and may be derived from
either genomic or cDNA sequences. Typically, a targeting
sequence is at least about 20 base pairs in length. The
size of the sequence is chosen to be a size which
selectively promotes homologous recombination with desired
genomic DNA sequences.

One or more targeting sequences can be employed. For
example, a circular plasmid or DNA fragment preferably
employs a single targeting sequence. A linear plasmid or
DNA fragment preferably employs two targeting sequences
with exogenous DNA to be inserted into the genome positioned
between the two targeting sequences. The targeting
sequence or sequences can be within an endogenous gene
(e.g., within the sequences of an exon and/or intron),
within the endogenous promoter sequences, or upstream of
the endogenous promoter sequences. The targeting sequence
or sequences can include those regions of a gene presently
known or sequenced and/or regions further upstream which
are structurally uncharacterized but can be mapped using
restriction enzymes and cloning approaches available to one
skilled in the art.

2. The Regulatory Sequence

The regulatory sequence of the DNA construct can be
comprised of one or more of a variety of elements,
including: promoters (such as constitutive or inducible
promoters), enhancers, scaffold-attachment regions or
matrix attachment regions (McKnight, R.A. et al., Proc.
Natl. Acad. Sci. USA 89:6943-6947 (1992); Phi-Van, L. and
Stratling, W.H., EMBO J. 7:655-664 (1988)), negative
regulatory elements, locus control regions (Pondel, M.D. et
al., Nucl. Acids Res. 20:237-243 (1992); Li, Q. and
Stamatoyannopoulos, G., Blood 84:1399-1401 (1994)),
transcription factor binding sites, or combinations of said
sequences.

3. Structural Elements of the DNA Construct

a. Exons and Introns
An exon is defined herein as a DNA sequence which is
copied into RNA and is present in a mature mRNA molecule.
An intron is defined as a sequence of one or more
nucleotides lying between two exons and which is removed,
by splicing, from a precursor RNA molecule in the formation
of an mRNA molecule.

The DNA constructs of the present invention contain
one or more exons. The exons can, optionally, contain DNA
which encodes one or more amino acids and/or partially
encodes an amino acid (i.e., one or two bases of a codon).
Where the exogenous exon or exons encode one or more amino
acids and/or a portion of an amino acid, the DNA construct
is designed such that, upon transcription and splicing, the
reading frame is in-frame with the second or subsequent
exon of the endogenous gene's coding region. As used
herein, in-frame means that the encoding sequences of, for
example, a first exon and a second exon when fused, join
together nucleotides in a manner that does not change the
appropriate reading frame of the portion of the mRNA
derived from the second exon.
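
The in-frame requirement reduces to arithmetic modulo three. The
following Python sketch is offered only as an illustration; the
function name and its two parameters are hypothetical. It assumes
that the coding portion of the exogenous exon runs from its ATG
to the splice-donor junction, and that the endogenous exon begins
with 0, 1, or 2 bases that complete a codon started upstream.

```python
def is_in_frame(exo_coding_len: int, endo_leading_bases: int) -> bool:
    """Return True if splicing preserves the endogenous reading frame.

    exo_coding_len: number of coding bases contributed by the
        exogenous exon (0 for a non-coding exon).
    endo_leading_bases: number of bases (0, 1, or 2) at the start of
        the endogenous exon that complete a codon begun upstream.
    """
    if exo_coding_len < 0 or endo_leading_bases not in (0, 1, 2):
        raise ValueError("lengths out of range")
    # The partial codon left at the end of the exogenous exon must be
    # completed exactly by the leading bases of the endogenous exon.
    return (exo_coding_len + endo_leading_bases) % 3 == 0


# A 7-base coding segment leaves one base of a codon; the endogenous
# exon must then supply the remaining two bases.
print(is_in_frame(7, 2))
```

An exogenous exon that partially encodes an amino acid (one or two
bases of a codon) is the case in which `exo_coding_len` is not a
multiple of three.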

In the case of activating the TPO and DNase I genes,
the exogenous exon can, preferably, be derived from any
gene in which the exon includes a CAP site and non-coding
sequences. Examples would include the first exon of the
CMV immediate-early gene and follicle stimulating hormone
(FSH) gene. In the case of β-interferon, whose gene
contains no natural introns, there are preferably two
exogenous non-coding exons, separated by an intron, in the
targeting construct.



b. Splice-Sites 
Introns contained within the mRNA of eukaryotic cells
are removed through the recognition of signals termed
splice-donor and splice-acceptor sites. A splice-donor
site is a sequence which directs the splicing of one exon
to another exon. Typically, the first exon lies 5' of the
second exon, and the splice-donor site overlapping and
flanking the first exon on its 3' side recognizes a
splice-acceptor site flanking the second exon on the 5'
side of the second exon. Splice-donor sites have a
characteristic consensus sequence represented as:
(A/C)AGGURAGU (where R denotes a purine nucleotide) with
the GU in the fourth and fifth positions being required
(Jackson, I.J., Nucleic Acids Research 19:3715-3798
(1991)). The first three bases of the splice-donor
consensus site are the last three bases of the exon.
Splice-donor sites are functionally defined by their
ability to effect the appropriate reaction within the mRNA
splicing pathway.
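
As an illustration of the consensus just described, the following
Python sketch scans a sequence for full matches to (A/C)AGGURAGU,
written in DNA form with T in place of U. The function name is
hypothetical, and a strict full-consensus match is an assumption
made for simplicity: as noted above, only the GU is required, and
functional donor sites are defined by their behavior in splicing
rather than by sequence alone.

```python
import re
from typing import List

# (A/C)AGGURAGU in DNA form; R (purine) is A or G. The lookahead
# allows overlapping candidates to be reported.
DONOR_CONSENSUS = re.compile(r"(?=([AC]AGGT[AG]AGT))")


def find_donor_sites(seq: str) -> List[int]:
    """Return 0-based positions of the first intron base (the G of
    the required GT) for each full consensus match in seq."""
    dna = seq.upper().replace("U", "T")
    # The first three consensus bases are the last three exon bases,
    # so the intron begins three bases after the match start.
    return [m.start() + 3 for m in DONOR_CONSENSUS.finditer(dna)]


print(find_donor_sites("TTTCAGGTAAGTCCC"))
```

Each reported position marks the exon/intron boundary implied by
the statement that the first three consensus bases belong to the
exon.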

An unpaired splice-donor site is defined herein as a
splice-donor site which is present in a targeting construct
and is not accompanied in the targeting construct by a
splice-acceptor site positioned 3' to the unpaired
splice-donor site. Upon homologous recombination between
the targeting sequences and genomic DNA, the unpaired
splice-donor site results in splicing to an endogenous
splice-acceptor site.

A splice-acceptor site is a sequence which, like a
splice-donor site, directs the splicing of one exon to
another exon. Acting in conjunction with a splice-donor
site, the splicing apparatus uses a splice-acceptor site to
effect the removal of an intron. Splice-acceptor sites
have a characteristic sequence represented as:
YYYYYYYYYYNYAG, where Y denotes any pyrimidine and N
denotes any nucleotide (Jackson, I.J., Nucleic Acids
Research 19:3715-3798 (1991)).
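
The acceptor consensus can be matched in the same manner. The
Python sketch below reads the consensus above as ten pyrimidines
followed by NYAG; that reading of the printed spacing, the
function name, and the strict full-length match are all
assumptions made for illustration only.

```python
import re
from typing import List

# Ten pyrimidines (C or T), any nucleotide, a pyrimidine, then AG.
ACCEPTOR_CONSENSUS = re.compile(r"(?=([CT]{10}[ACGT][CT]AG))")


def find_acceptor_sites(seq: str) -> List[int]:
    """Return 0-based positions of the first base after the AG for
    each full consensus match in seq (i.e. the putative exon start)."""
    dna = seq.upper().replace("U", "T")
    # The consensus is 14 bases long and lies at the 3' end of the
    # intron, so the exon would begin 14 bases after the match start.
    return [m.start() + 14 for m in ACCEPTOR_CONSENSUS.finditer(dna)]


print(find_acceptor_sites("A" + "CTCTCTCTCT" + "GCAG" + "GATT"))
```

Natural acceptor sites often deviate from a strict consensus, so
a scan of this kind identifies candidates only.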
c. Marker Genes for Selection and Amplification
The identification of the targeting event can be
facilitated by the use of one or more selectable marker
genes typically contained within the targeting DNA
construct. The use of both positively and negatively
selectable markers for identifying targeted events is
described in related pending applications U.S.S.N.
08/243,391, U.S.S.N. 07/985,586, U.S.S.N. 07/789,188,
PCT/US93/11704, and PCT/US92/09627.
Homologously recombinant cells containing multiple
copies of the novel transcription units produced by the
present invention may be isolated by including within the
targeting DNA construct an amplifiable marker gene which
has the property that cells containing multiple copies of
the selectable marker gene can be selected for by culturing
the cells in the presence of an appropriate selectable
agent. The novel transcription unit will be amplified in
tandem with the amplified selectable marker gene, allowing
the production of very high levels of the desired protein.
Amplifiable marker genes and their use are described in
applications U.S.S.N. 08/243,391, U.S.S.N. 07/985,586, and
PCT/US93/11704.

In one embodiment the positively selectable marker neo
(derived from the bacterial neomycin phosphotransferase
gene) is used to select for cells which
have stably incorporated the DNA of the targeting
construct, and the mouse dhfr (dihydrofolate reductase)
gene is used to subsequently amplify the novel
transcription unit present in homologously recombinant
cells.

d. Additional Elements of the Targeting 
Construct 

As taught herein, gene targeting can be used to insert
a regulatory sequence within an endogenous gene (e.g.,
within the sequences of an exon and/or intron), within the
endogenous promoter sequences, or upstream of the
endogenous promoter sequences, with said genes
corresponding to the endogenous cellular TPO, β-interferon,
or DNase I gene. Alternatively or additionally, the
targeting constructs may be designed to include sequences
which affect the structure or stability of the TPO,
β-interferon, or DNase I protein or corresponding RNA
molecule. For example, RNA stability elements, splice
sites, and/or leader sequences of RNA molecules can be
modified to improve or alter the function, stability,
and/or translatability of an RNA molecule. Protein
sequences may also be altered, such as signal sequences,
active sites, and/or structural sequences for enhancing or
modifying glycosylation, transport, secretion, or
functional properties of a protein. According to this
method, introduction of the exogenous DNA results in the
alteration of the structural or functional properties of
the expressed proteins or RNA molecules.

In one embodiment the method can be used to create
novel transcription units encoding fusion proteins in which
structural, enzymatic, or ligand or receptor binding
protein domains of another protein are fused to TPO, DNase
I, or β-interferon. In these cases the exogenous coding
DNA contains an ATG translation initiation codon in-frame
with the coding sequences of the endogenous TPO, DNase I,
or β-interferon gene. For example, the exogenous DNA can
encode a sequence which can anchor TPO or DNase I to a
membrane, a portion of a signal peptide designed to improve
cellular secretion, leader sequences, enzymatic regions,
transmembrane domain regions, co-factor binding regions, or
other functional regions.

The DNA construct can also include a bacterial origin
of replication and bacterial antibiotic resistance markers
or other selectable markers, which allow for large-scale
plasmid propagation in bacteria or any other suitable
cloning/host system.



B. Transfection and Homologous Recombination

According to the present method, the construct is
introduced into the cell, such as a primary, secondary, or
immortalized cell, as a single DNA construct, or as
separate DNA sequences which become incorporated into the
chromosomal or nuclear DNA of a transfected cell.

The targeting DNA construct can be introduced into
cells on a single DNA construct or on separate constructs.
The total length of the DNA construct will vary according
to the number of components and the length of each, and the
construct will generally be at least about 200 nucleotides.
Further, the DNA can be introduced as linear,
double-stranded (with or without single-stranded regions at
one or both ends), single-stranded, or circular DNA.

Any of the construct types of the disclosed invention
is then introduced into the cell to obtain a transfected
cell. The transfected cell is maintained under conditions
which permit homologous recombination, as is known in the
art (reviewed in Capecchi, M.R., Science 244:1288-1292
(1989)). When the homologously recombinant cell is
maintained under conditions sufficient for transcription of
the DNA, the regulatory region introduced by the targeting
construct, as in the case of a promoter, will activate
expression of the novel transcription unit produced by
homologous recombination.

The DNA constructs may be introduced into cells by a
variety of physical or chemical methods, including
electroporation, microinjection, microprojectile
bombardment, calcium phosphate precipitation, and
liposome-, polybrene-, or DEAE dextran-mediated
transfection.

C. The Targeted Gene and Resulting Product

The targeting DNA construct, when introduced by
homologous recombination or targeting into cells containing
the TPO, β-interferon, or DNase I gene, produces a novel
transcription unit which results in the expression of TPO,
β-interferon, or DNase I.

At the targeted site in the genome, the exogenous
regulatory sequence is operatively linked to a CAP site,
which initiates transcription. Operatively linked is
defined as a configuration in which the exogenous
regulatory sequence, exon, splice-donor site and,
optionally, an intron sequence and splice-acceptor site,
are appropriately targeted at a position relative to the
endogenous gene such that the regulatory element directs
the production of a primary RNA transcript which initiates
at a CAP site and includes sequences corresponding to the
exogenous exon or exons and endogenous exons of the TPO,
DNase I, or β-interferon gene. In an operatively linked
configuration the splice-donor site of the targeting
construct directs a splicing event between an exogenous
exon and the splice-acceptor site of an endogenous exon,
such that a desired protein can be produced from the fully
spliced mature transcript. In one embodiment, the
splice-acceptor site is endogenous, such that the splicing
event is directed to an endogenous exon of the TPO or DNase
I gene. In another embodiment an intron and a
splice-acceptor site are included in the targeting construct
used to activate the β-interferon gene, and a splicing event
removes the intron introduced by the targeting construct.

D. The Homologously Recombinant Cells

The targeting event results in the insertion of the
regulatory and structural sequences of the targeting
construct into a cell's genome, creating a novel
transcriptional unit under the control of the exogenous
regulatory sequences.

Homologous recombination between the genomic DNA and
the introduced DNA results in a homologously recombinant
cell, which may be a primary, secondary, or immortalized
human or other mammalian cell in which sequences which
alter the expression of an endogenous gene are operatively
linked to the endogenous TPO, DNase I, or β-interferon
gene. Particularly, the invention includes a homologously
recombinant cell comprising exogenous regulatory sequences
and an exon, flanked by a splice-donor site, which are
introduced at a predetermined site by a targeting DNA
construct, and are operatively linked to the coding region
of the endogenous gene. Optionally, there may be multiple
exogenous exons (coding or non-coding) and introns
operatively linked to any exon of the endogenous gene. The
resulting homologously recombinant cells are cultured under
conditions which select for amplification, if appropriate,
of the DNA encoding the amplifiable marker and the novel
transcriptional unit. With or without amplification, cells
produced by this method can be cultured under conditions,
as are known in the art, suitable for the expression of
TPO, β-interferon, or DNase I.

The targeting constructs and methods of the present
invention may be used with, for example, primary or
secondary cell strains (which exhibit a finite number of
mean population doublings in culture and are not
immortalized) and immortalized cell lines (which exhibit an
apparently unlimited lifespan in culture). Primary and
secondary cells include, for example, fibroblasts,
keratinocytes, epithelial cells (e.g., mammary epithelial
cells, intestinal epithelial cells), endothelial cells,
glial cells, neural cells, formed elements of the blood
(e.g., lymphocytes, bone marrow cells), muscle cells and
precursors of these somatic cell types. Where the
homologously recombinant cells are to be used in gene
therapy, primary cells are preferably obtained from the
individual to whom the resulting homologously recombinant
cells are administered. However, primary cells can be
obtained from a donor (other than the recipient) of the
same species. Examples of immortalized human cell lines
which may be used with the DNA constructs and methods of
the present invention include, but are not limited to,
HT1080 cells (ATCC CCL 121), HeLa cells and derivatives of
HeLa cells (ATCC CCL 2, 2.1 and 2.2), MCF-7 breast cancer
cells (ATCC HTB 22), K-562 leukemia cells (ATCC CCL 243),
KB carcinoma cells (ATCC CCL 17), 2780AD ovarian carcinoma
cells (Van der Blick, A.M. et al., Cancer Res. 48:5927-5932
(1988)), Raji cells (ATCC CCL 86), WiDr colon adenocarcinoma
cells (ATCC CCL 218), SW620 colon adenocarcinoma cells
(ATCC CCL 227), Jurkat cells (ATCC TIB 152), Namalwa cells
(ATCC CRL 1432), HL-60 cells (ATCC CCL 240), Daudi cells
(ATCC CCL 213), RPMI 8226 cells (ATCC CCL 155), U-937 cells
(ATCC CRL 1593), Bowes Melanoma cells (ATCC CRL 9607),
WI-38VA13 subline 2R4 cells (ATCC CCL 75.1), and MOLT-4
cells (ATCC CRL 1582), as well as heterohybridoma cells
produced by fusion of human cells and cells of another
species. Secondary human fibroblast strains, such as WI-38
(ATCC CCL 75) and MRC-5 (ATCC CCL 171) may be used.
Further discussion of the types of cells that may be used
in practicing the methods of the present invention is
presented in applications U.S.S.N. 08/243,391, U.S.S.N.
07/985,586, U.S.S.N. 07/789,188, U.S.S.N. 07/911,533,
U.S.S.N. 07/787,840, PCT/US93/11704, and PCT/US92/09627.

E. In Vivo Protein Production

Homologously recombinant cells of the present
invention in which the expression properties of the
endogenous TPO, β-interferon, or DNase I gene are altered
are useful in gene therapy, as populations of homologously
recombinant cell lines, as populations of homologously
recombinant primary or secondary cells, homologously
recombinant clonal cell strains or lines, homologously
recombinant heterogenous cell strains or lines, and as cell
mixtures in which at least one representative cell of one
of the preceding categories of homologously recombinant
cells is present. Homologously recombinant primary cells,
clonal cell strains or heterogenous cell strains are
administered to an individual in whom the abnormal or
undesirable condition is to be treated or prevented, in
sufficient quantity and by an appropriate route, to express
or make available the desired product at physiologically
relevant levels. A physiologically relevant level is one
which either approximates the level at which the product is
normally produced in the body or results in improvement of
the abnormal or undesirable condition. Methods for gene
therapy in which homologously recombinant cells are
introduced into an individual for the purpose of in vivo
protein production are described in pending applications
U.S.S.N. 08/243,391, U.S.S.N. 07/985,586, U.S.S.N.
07/789,188, U.S.S.N. 07/911,533, U.S.S.N., PCT/US93/11704,
and PCT/US92/09627.

In one embodiment, the invention relates to a method
of providing TPO to a mammal by introducing homologously
recombinant cells into the mammal in sufficient number to
produce an effective amount of TPO in the mammal.

In another embodiment homologously recombinant cells
expressing DNase I can be administered to the trachea and
lungs of a cystic fibrosis patient, for the purpose of in
vivo secretion of DNase I for the relief of respiratory
distress.

In a third embodiment, homologously recombinant cells
expressing β-interferon may be implanted into a patient
suffering from multiple sclerosis, for the purpose of in
vivo secretion of β-interferon to diminish exacerbations
associated with the disease.



F. In Vitro Protein Production 

Homologously recombinant cells produced according to
this invention can also be used for in vitro production of
TPO, β-interferon, or DNase I. The cells are maintained
under conditions, as are known in the art, which result in
expression of the protein. Proteins expressed using the
methods described may be purified from cell lysates or cell
supernatants. Proteins made according to this method can
be prepared as a pharmaceutically-useful formulation and
delivered to a human or non-human animal by conventional
pharmaceutical routes as is known in the art (e.g., oral,
intravenous, intramuscular, intranasal, intratracheal or
subcutaneous). As described herein, the homologously
recombinant cells can be immortalized, primary, or
secondary human cells. The use of cells from other species
may be desirable in cases where the non-human cells are
advantageous for protein production purposes where the
non-human TPO, DNase I, or β-interferon produced is useful
therapeutically.

G. Advantages 

The methodologies, DNA constructs, cells, and
resulting proteins of the invention herein possess
versatility and many other advantages over processes
currently employed within the art in gene targeting. The
ability to activate expression of an endogenous TPO,
β-interferon, or DNase I gene by positioning an exogenous
regulatory sequence and other structural sequences at
various positions ranging from directly fused to portions
of the normal gene's coding region to 30 kilobase pairs or
further upstream of the transcribed region of an endogenous
gene, or within an intron of an endogenous gene, is
advantageous for gene expression in cells. For example, it
can be employed to position the regulatory element upstream
or downstream of regions that normally silence or
negatively regulate a gene. The positioning of a
regulatory element upstream or downstream of such a region
can override such dominant negative effects that normally
inhibit transcription. In addition, regions of DNA that
normally inhibit transcription or have an otherwise
detrimental effect on the expression of a gene may be
deleted using the targeting constructs described herein.
The present invention also allows proteins to be expressed
in the context of their normal intron sequences, which have
been shown to be important factors in the expression of
genes in mammalian cells (cf. Korb, M. et al., Nucl. Acids
Res. 21:5901-5908 (1993)).

Additionally, since promoter function is known to
depend strongly on the local environment, a wide range of
positions may be explored in order to find those local
environments optimal for function. However, since ATG
start codons are found frequently within mammalian DNA
(approximately one occurrence per 48 base pairs as
calculated from nearest-neighbor dinucleotide frequencies
in human DNA), transcription cannot simply initiate at any
position upstream of a gene and produce a transcript
containing a long leader sequence preceding the correct ATG
start codon, since the frequent occurrence of ATG codons in
such a leader sequence will prevent translation of the
correct gene product and render the message useless. Thus,
the incorporation of an exogenous exon, a splice-donor
site, and, optionally, an intron and a splice-acceptor site
into targeting constructs comprising a regulatory region
allows gene expression to be optimized by identifying the
optimal site for regulatory region function, without the
limitation imposed by needing to avoid inappropriate ATG
start codons in the mRNA produced. This provides
significantly increased flexibility in the placement of the
construct and makes it possible to activate a wider range
of genes than is possible using other technologies. For
example, U.S. Patent No. 5,272,071 and foreign patent
applications WO 91/06666, WO 91/06667 and WO 90/11354
describe homologous recombination methods for inserting a
regulatory sequence upstream of the coding region of an
endogenous gene. In these methods, only a very small
number of positions for promoter insertion are acceptable
for expression, limited by the frequent occurrence of ATG
start codons as described above.
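
The frequency argument above can be illustrated with a short Python
sketch. It is not part of the described methods: the function names are
hypothetical, and the first-order (nearest-neighbor) approximation
P(ATG) = f(AT) x f(TG) / f(T) is assumed to be the kind of calculation
the text refers to. No human dinucleotide frequencies are supplied here,
so the sketch does not reproduce the figure of one ATG per 48 base
pairs; with uniform base composition the same formula would give one
per 64.

```python
from collections import Counter


def expected_atg_spacing(seq):
    """Estimate the mean spacing (bp) between ATG triplets from
    nearest-neighbor frequencies: P(ATG) ~ f(AT) * f(TG) / f(T)."""
    seq = seq.upper()
    if len(seq) < 3:
        raise ValueError("sequence must contain at least 3 bases")
    mono = Counter(seq)
    di = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    f_t = mono["T"] / len(seq)
    f_at = di["AT"] / (len(seq) - 1)
    f_tg = di["TG"] / (len(seq) - 1)
    p_atg = f_at * f_tg / f_t if f_t else 0.0
    return float("inf") if p_atg == 0 else 1.0 / p_atg


def upstream_atgs(leader):
    """Return the 0-based position of every ATG in a 5' leader sequence."""
    leader = leader.upper()
    return [i for i in range(len(leader) - 2) if leader[i:i + 3] == "ATG"]
```

Applied to a candidate leader, upstream_atgs lists the out-of-place
start codons that an unspliced transcript would carry ahead of the
correct ATG, which is the limitation the exogenous exon and splice-donor
site are intended to remove.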

The present invention provides further advantages over
the methods available in the art. For example, the use of
homologous recombination results in the production of cells
in which the novel transcription unit is present in the
same location in all cells in which homologous
recombination has occurred. Thus, the novel transcription
unit will function similarly in all homologously
recombinant cells derived independently. This allows for
the production of cells with highly predictable properties.
In the case of in vitro protein production, it is desirable
to develop cells in which the behavior (e.g., the expression
and amplification properties) of the desired gene can be
controlled and there is little variation when comparing
individual cells which are being processed for large-scale
production purposes. In the case of in vivo protein
production or gene therapy, it is desirable to be able to
develop cells in which the properties are predictable and
uniform among individual patients. This allows for a high
degree of precision in achieving appropriate levels of the
desired protein in vivo, leading to controlled and
reproducible methods for treating disease.

The DNA constructs described above are useful for
operatively linking exogenous regulatory and structural
elements to endogenous coding sequences in a way that
precisely creates a novel transcriptional unit, provides
flexibility in the relative positioning of exogenous
regulatory elements and endogenous genes and, ultimately,
enables a highly controlled system for regulating
expression of genes of therapeutic interest.



The subject invention will now be illustrated by the 
following examples, which are not intended to be limiting 
in any way. 

EXAMPLES 



EXAMPLE 1: Cloning of the TPO Gene and Identification of
5' Flanking Sequences

The human thrombopoietin gene was isolated from a
human genomic DNA library. The library was prepared from
male leukocyte DNA partially digested with MboI and cloned
into the bacteriophage vector lambda EMBL3 (Clontech, Palo
Alto, CA; Cat. #HL1006d). For screening, a probe was
isolated by PCR amplification of human genomic DNA using
oligonucleotides 1.1 and 1.2.

Oligo 1.1 (TPO sense) (SEQ ID NO: 1)

5' AATTGCTCCT CGTGGTCATG CTTCT

Oligo 1.2 (TPO anti-sense) (SEQ ID NO: 2)

5' CTGTGAAGGA CATGGGAGTC A

These primers were designed using the known TPO mRNA
sequence (de Sauvage, F. J. et al., Nature 369:533-538
(1994)). The amplified probe (probe A; 120 bp) was labeled
with ³²P-dCTP by the polymerase chain reaction and used to
screen the genomic DNA library. Filters were hybridized
for 6 hours at 68°C in 125 mM Na2HPO4 (pH 7.2), 250 mM
NaCl, 10% PEG 8000, 7% SDS, 1 mM EDTA. Filters were washed
twice in 500 ml of 20 mM Na2HPO4 (pH 7.2), 1 mM EDTA, 5%
SDS, followed by 4 washes in 500 ml of 20 mM Na2HPO4 (pH
7.2), 1 mM EDTA, 1% SDS. The wash buffers were pre-heated
to 56°C and washing was done on a rotary shaker at room
temperature for approximately 5 minutes per wash. The
hybridizing signals were identified by autoradiography at
-80°C with an intensifying screen. In one experiment,
approximately 1.4 x 10^ phage were screened and 7 positive
signals were obtained. Phage plaques corresponding to
positive signals were plaque purified. Following 2 rounds
of plaque purification by low density screening using probe
A, 4 of the phage, designated 5B, 25A, 25B and 28B, were
retained for further analysis. Plaque purified phage were
amplified and isolated by cesium chloride gradient
ultracentrifugation (Yamamoto, K.R. et al., Virology 40:734
(1970)) and DNA was isolated. Library screening, plaque
purification of recombinant bacteriophage, and isolation of
bacteriophage DNA were performed using standard methods
(Ausubel et al., Current Protocols in Molecular Biology,
Wiley, New York, NY (1987)).

An approximately 6.9 kb XbaI fragment comprising exon
1, intron 1, exon 2, intron 2, exon 3, and a portion of
intron 3, as well as approximately 4.3 kb of nontranscribed
DNA lying upstream of TPO exon 1, was identified by
restriction enzyme and Southern hybridization analysis
using probe A. This fragment was isolated from one genomic
clone (28B) and subcloned into plasmid pBSIISK+ (Stratagene
Inc., La Jolla, CA) for further analysis. The resultant
clones, pBS(X)/5'Thromb.8 and pBS(X)/5'Thromb.2, harbor the
6.9 kb XbaI fragment in opposite orientations with respect
to the plasmid backbone. Restriction enzyme mapping
yielded the restriction enzyme map shown in Figure 3. The
nucleotide sequence of the portion of this fragment lying
upstream of the 5' end of the known cDNA sequence is shown
in Figure 4 (SEQ ID NO: 3). The nucleotide sequence of the
portion of the 6.9 kb XbaI fragment lying downstream of the
5' end of the known cDNA sequence is shown in Figure 5 (SEQ
ID NO: 4). Comparison of the cloned genomic sequence
presented here with the published cDNA sequence (de
Sauvage, F. J. et al., Nature 369:533-538 (1994)) reveals
that the 5' end of the TPO gene consists of a non-coding
exon (exon 1) of at least 107 bp, a second exon (exon 2)
which is 158 bp, and a third exon (exon 3) which is 128 bp
in length. The 13 base pairs at the 3' end of exon 2 code
for the first four and a portion of the fifth amino acid of
the TPO signal peptide. Exon 3 codes for the remainder of
the 21 amino acid signal peptide and a portion of the
mature TPO polypeptide. Exons 1 and 2 are separated by
intron 1 (1671 bp), and exons 2 and 3 are separated by
intron 2 (231 bp). There are two differences between the
sequence reported in Figure 5 and the sequence published by
de Sauvage et al.: nucleotides at positions -134 and -124
are reported as C residues by de Sauvage et al. and are
shown as T residues in Figure 5. These residues are
outside of the coding sequence for TPO and may be explained
by sequence polymorphism or by errors in compilation of the
published sequence. In any event, this minor difference
does not impact the ability of the person of skill to
practice the invention as described herein.
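
The exon and intron sizes given above can be turned into coordinates
relative to the TPO translation start site, which is how later
positions in the Examples are expressed. The following Python sketch
assumes the convention that the A of the ATG is +1 and that there is no
position 0; that convention and the function name are assumptions made
for illustration, although the convention agrees with the later
statement that the first 29 base pairs of intron 2 are nucleotides +14
to +42.

```python
EXON2_LEN = 158          # bp, from the text
EXON2_CODING = 13        # coding bp at the 3' end of exon 2
INTRON1_LEN = 1671       # bp
INTRON2_LEN = 231        # bp
SIGNAL_PEPTIDE_AA = 21   # amino acids


def tpo_five_prime_map():
    """Return (start, end) coordinates relative to the translation
    start site (A of ATG = +1, no position 0; assumed convention)."""
    exon2 = (-(EXON2_LEN - EXON2_CODING), EXON2_CODING)
    intron1 = (exon2[0] - INTRON1_LEN, exon2[0] - 1)
    intron2 = (exon2[1] + 1, exon2[1] + INTRON2_LEN)
    full_codons, extra_bases = divmod(EXON2_CODING, 3)
    return {
        "exon2": exon2,
        "intron1": intron1,
        "exon1_3prime_end": intron1[0] - 1,
        "intron2": intron2,
        "full_codons_in_exon2": full_codons,
        "extra_bases_in_exon2": extra_bases,
        # signal-peptide bases that must be supplied by exon 3
        "signal_peptide_bp_in_exon3": SIGNAL_PEPTIDE_AA * 3 - EXON2_CODING,
    }
```

Under these assumptions exon 2 carries four complete codons plus one
base of the fifth, matching "the first four and a portion of the fifth
amino acid" in the text.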

EXAMPLE 2: Construction of Targeting Plasmids for
Activation and Amplification of the TPO Gene

The activation of the TPO gene can be accomplished by
a number of strategies, as shown in Figures 6-8. In the
strategy shown in Figure 6, a targeting fragment is
introduced into the genome of recipient cells for insertion
of a regulatory region, a non-coding exon, and a
functional, unpaired splice-donor site upstream of the TPO
coding region. Specifically, the targeting construct from
which this fragment is derived (pRTPO1) is designed to
include a first targeting sequence homologous to sequences
upstream of the TPO gene, an amplifiable marker gene, a
selectable marker gene, a regulatory region, a CAP site, a
non-coding exon, an unpaired splice-donor site, and a
second targeting sequence corresponding to sequences
downstream of the first targeting sequence but upstream of
TPO exon 1. By this strategy, homologously recombinant
cells produce an mRNA precursor which includes the
non-coding exon introduced upstream of the TPO gene by
homologous recombination, the second targeting sequence and
any sequences between the second targeting sequence and
exon 2 of the TPO gene, and the remaining exons, introns,
and 3' untranslated regions of the TPO gene (Figure 6).
Splicing of this message results in the fusion of the
exogenous non-coding exon to exon 2 of the endogenous TPO
gene which, when translated, will produce TPO. In this
strategy the first and second targeting sequences are
upstream of the normal target gene, but this is not
required (see below). The size of the intron in the
targeting construct and thus the position of the regulatory
region relative to the coding region of the gene may be
varied to optimize the function of the regulatory region.

Plasmid pRTPO1 is constructed as follows: Based on the
restriction map of the TPO upstream region (Figure 3), a
3.5 kb BamHI fragment can be isolated from subclone
pBS(X)/5'Thromb.8 (Example 1). This fragment is ligated to
BamHI digested plasmid pBS (Stratagene, Inc., La Jolla, CA)
and transformed into competent E. coli cells to generate
pBS-TPO1. This fragment includes sequences lying upstream
of TPO exon 1. Next, a 0.73 kb fragment is amplified from
hGH expression construct pXGH308, which has the CMV
immediate-early (IE) gene promoter region beginning at
nucleotide 546 and ending at nucleotide 2105 of Genbank
sequence HS5MIEP fused to the hGH sequences beginning at
nucleotide 5225 and ending at nucleotide 7322 of Genbank
sequence HUMGHCSA, using oligonucleotides 2.1 and 2.2.
(The source of the CMV IE gene is not critical, and other
CMV IE promoter-based plasmids may be used, or wild-type
CMV DNA may be used.) Oligo 2.1 (37 bp, SEQ ID NO: 5)
hybridizes to the CMV IE promoter at -614 relative to the
cap site (in Genbank sequence HEHCMVP1), and includes a
NotI site followed by a partially overlapping XhoI site at
its 5' end. Oligo 2.2 (36 bp, SEQ ID NO: 6) hybridizes to
the CMV IE promoter at +131 relative to the cap site,
includes the first 10 base pairs of the first intron of the
CMV IE gene, and contains a NotI site at its 5' end. The
resulting PCR fragment is digested with NotI and
gel-purified. Plasmid pBS-TPO1 is digested with NotI,
which cleaves at a single site upstream of TPO exon 1
(Figure 3), and the digested DNA is ligated to the CMV
promoter fragment prepared above and transformed into
competent E. coli cells. Colonies containing inserts of
the CMV promoter inserted at the NotI site of pBS-TPO1 are
analyzed by restriction enzyme analysis to confirm the
orientation of the insert, and one recombinant plasmid in
which the CMV promoter is oriented such that the direction
of transcription is towards TPO exon 1 is identified and
designated pBS-TPO2.

Oligo 2.1 (SEQ ID NO: 5)

5' TTTT GCGGCC GCTCGAG GAC ATTGATTATT GACTAGT
        NotI     XhoI

Oligo 2.2 (SEQ ID NO: 6)

5' TTTT GCGGCC GC CGGTACTT ACGTCACTCT TGGCAC
        NotI

Next, the neomycin phosphotransferase (neo) gene is
inserted into pBS-TPO2 for use as a selectable marker in
isolating stably transfected human cells. Plasmid
pMC1neoPolyA [Thomas, K.R. and Capecchi, M.R., Cell
51:503-512 (1987); available from Stratagene Inc., La
Jolla, CA] is digested with BamHI and made blunt-ended by
treatment with the Klenow fragment of E. coli DNA
polymerase. The treated DNA is then ligated to a
double-stranded 10 base pair ClaI linker of the sequence
5' GGATCGATCC, chosen such that the BamHI site is not
regenerated by the linker addition. The resulting DNA is
digested with ClaI and the digested DNA is ligated under
dilute conditions to promote recircularization and
transformed into competent E. coli cells. Transformed
colonies are analyzed by restriction enzyme digestion to
identify cells containing a derivative of plasmid
pMC1neoPolyA with an insertion of a ClaI site at the 3' end
of the neo gene. This plasmid is designated pMC1neo-C.
pMC1neo-C is digested with XhoI and SalI and the
approximately 1.1 kb fragment containing the neo
expression unit is gel purified. Plasmid pBS-TPO2 is
digested at the unique XhoI site which was introduced by
PCR at the 5' end of the CMV promoter, and the digested DNA
is ligated to the purified XhoI-SalI fragment containing
the neo gene and transformed into competent E. coli cells.
Colonies containing inserts of the neo gene inserted at the
XhoI site of pBS-TPO2 are analyzed by restriction enzyme
analysis to confirm the orientation of the insert, and one
recombinant plasmid in which the neo gene is oriented such
that the direction of transcription is opposite to CMV is
identified and designated pBS-TPO3.
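
The choice of ClaI linker described above can be checked with a few
lines of Python. This is only an illustrative sketch: the recognition
sequences used (GGATCC for BamHI, ATCGAT for ClaI) are the standard
ones, the filled-in end sequences assume BamHI cleavage at G^GATCC
followed by complete fill-in of the 5' overhang, and the variable and
function names are hypothetical.

```python
BAMHI_SITE = "GGATCC"
CLAI_SITE = "ATCGAT"
LINKER = "GGATCGATCC"     # 10 bp ClaI linker from the text

# Ends left after BamHI cleavage (G^GATCC) and Klenow fill-in (assumed)
FILLED_LEFT = "GGATC"     # upstream end, top strand
FILLED_RIGHT = "GATCC"    # downstream end, top strand


def revcomp(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


# One linker between the two filled-in ends, as expected after ClaI
# digestion and recircularization.
junction = FILLED_LEFT + LINKER + FILLED_RIGHT

linker_is_palindromic = revcomp(LINKER) == LINKER
linker_has_clai_site = CLAI_SITE in LINKER
bamhi_site_regenerated = BAMHI_SITE in junction
```

With these inputs the linker is self-complementary and carries a ClaI
site, and the junction contains no GGATCC, consistent with the
statement that the BamHI site is not regenerated.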

Finally, the targeting construct pRTPO1 is constructed
by insertion of a dhfr expression unit (to select for
amplification in targeted human cells) at the ClaI site
located at the 5' end of the neo gene of pBS-TPO3. To
obtain a dhfr expression unit, the plasmid construct
pF8CIS9080 [Eaton et al., Biochemistry 25:8343-8347
(1986)] is digested with EcoRI and SalI. A 2 kb fragment
containing the dhfr expression unit is purified from this
digest and made blunt by treatment with the Klenow fragment
of DNA polymerase I. A ClaI linker (New England Biolabs,
Beverly, MA) is then ligated to the blunted dhfr fragment.
The products of this ligation are digested with ClaI and
ligated to ClaI digested pBS-TPO3. An aliquot of this
ligation is transformed into E. coli and plated on
ampicillin selection plates. Bacterial colonies are
analyzed by restriction enzyme digestion to determine the
orientation of the inserted dhfr fragment. One plasmid
with dhfr in a transcriptional orientation opposite that of
the neo gene is designated pRTPO1. For targeting to the
TPO locus in cultured human cells, pRTPO1 is digested with
BamHI to separate the targeting fragment containing the
targeting DNA, neo gene, dhfr gene, CMV promoter, and
splice-donor site from the pBS plasmid backbone.

A second strategy for activation of the TPO gene is
shown in Figure 7. In this strategy, a targeting fragment
is introduced into the genome of recipient cells for
insertion of a regulatory region, a non-coding exon, a
splice-donor site, an intron, a splice-acceptor site, a
second non-coding exon, and a functional, unpaired
splice-donor site upstream of the TPO coding region.
Specifically, the targeting construct from which this
fragment is derived (pRTPO2) is designed to include a first
targeting sequence homologous to sequences upstream of the
TPO gene, an amplifiable marker gene, a selectable marker
gene, a regulatory region, a CAP site, a non-coding exon, a
splice-donor site, an intron, a splice-acceptor site, a
second non-coding exon, an unpaired splice-donor site, and
a second targeting sequence corresponding to sequences
downstream of the first targeting sequence but upstream of
TPO exon 2. By this strategy, homologously recombinant
cells produce an mRNA precursor which corresponds to the
first and second non-coding exogenous exons separated by an
intron, the second targeting sequence, any sequences
between the second targeting sequence and exon 2 of the TPO
gene, and the remaining exons, introns, and 3' untranslated
regions of the TPO gene (Figure 7). Splicing of this
message results in the fusion of the second non-coding
exogenous exon to exon 2 of the endogenous TPO gene which,
when translated, will produce TPO. In this strategy the
first and second targeting sequences are upstream of the
normal target gene, but this is not required (see below).
The size of the intron in the targeting construct and thus
the position of the regulatory region relative to the
coding region of the gene may be varied to optimize the
function of the regulatory region.

Plasmid pRTPO2 is constructed as follows: Based on
the restriction map of the TPO upstream region (Figure 3),
a 1.8 kb BamHI-EcoRI fragment can be isolated from subclone
pBS(X)/5'Thromb.8 (Example 1). This fragment is ligated to
BamHI and EcoRI digested plasmid pBS (Stratagene, Inc., La
Jolla, CA) and transformed into competent E. coli cells to
generate pBS-TPO4. This fragment includes TPO exon 1 but
contains no TPO coding sequences.

Next, oligonucleotides 2.3 to 2.6 are used in PCR to
fuse CMV IE promoter sequences beginning at nucleotide 546
and ending at nucleotide 2105 of Genbank sequence HS5MIEP
to sequences from the TPO gene comprised of exon 1 and a
portion of intron 1. The properties of these primers are
as follows: 2.3 (SEQ ID NO: 7) is a 30 base
oligonucleotide homologous to a segment of the CMV IE
promoter beginning at nucleotide 546 of Genbank sequence
HS5MIEP (-614 relative to the cap site) and includes an XhoI
site at its 5' end; 2.4 (SEQ ID NO: 8) and 2.5 (SEQ ID NO:
9) are 60 nucleotide complementary primers which define the
fusion of CMV (position 2100 of Genbank sequence HS5MIEP)
and TPO (position -1881 relative to the TPO translation
start site) sequences; 2.6 (SEQ ID NO: 10) is 27
nucleotides in length and is homologous to TPO sequences
ending in TPO intron 1 at position -1374 relative to the
TPO translation start site and includes a natural ApaI
site.

Oligo 2.3 (SEQ ID NO: 7)

5' TTTT CTCGAG GACATTGATT ATTGACTAGT
        XhoI

Oligo 2.4 (SEQ ID NO: 8)

5' catgggtctt ttctgcagtc accgtccttg CTACCCATCT GCTCCCCAGA
GGGCTGCCTG

Oligo 2.5 (SEQ ID NO: 9)

5' CAGGCAGCCC TCTGGGGAGC AGATGGGTAG caaggacggt gactgcagaa
aagacccatg

Oligo 2.6 (SEQ ID NO: 10)

5' TTTT GGGCCC TCCTCCCATT ACCCTCT
        ApaI

Oligos 2.3-2.6: Bases in lower-case type denote CMV
sequences; bases in upper-case type denote TPO sequences.
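
Because oligos 2.4 and 2.5 are described as complementary primers, one
should be the exact reverse complement of the other. The short Python
sketch below checks this for the sequences as printed above; the helper
names are hypothetical and the check is offered only as an illustration
of how such primer pairs can be verified.

```python
OLIGO_2_4 = ("catgggtctt ttctgcagtc accgtccttg "
             "CTACCCATCT GCTCCCCAGA GGGCTGCCTG")
OLIGO_2_5 = ("CAGGCAGCCC TCTGGGGAGC AGATGGGTAG "
             "caaggacggt gactgcagaa aagacccatg")


def clean(seq):
    """Remove the spacing used for display."""
    return "".join(seq.split())


def revcomp(seq):
    """Reverse complement, preserving upper/lower case."""
    return seq.translate(str.maketrans("ACGTacgt", "TGCAtgca"))[::-1]


oligo_2_4 = clean(OLIGO_2_4)
oligo_2_5 = clean(OLIGO_2_5)
primers_are_complementary = revcomp(oligo_2_4) == oligo_2_5
```

Case is preserved so that the CMV (lower-case) and TPO (upper-case)
halves can still be told apart on either strand; each half is 30
nucleotides long.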

These primers are used to amplify a 2.1 kb DNA
fragment comprising a fusion of CMV IE and TPO sequences.
The fusion fragment is created by first using oligos 2.3
and 2.4 to amplify a 1.6 kb fragment from hGH expression
construct pXGH308, which has the CMV immediate-early (IE)
gene promoter region beginning at nucleotide 546 and ending
at nucleotide 2105 of Genbank sequence HS5MIEP fused to the
hGH sequences beginning at nucleotide 5225 and ending at
nucleotide 7322 of Genbank sequence HUMGHCSA. (The source
of the CMV IE gene is not critical, and other CMV IE
promoter-based plasmids may be used, or wild-type CMV DNA
may be used.) Then, oligos 2.5 and 2.6 are used to amplify
a 0.54 kb fragment containing portions of TPO exon 1 and
TPO intron 1 from plasmid pBS(X)/5'Thromb.8 (Example 1).
The two amplified fragments are then combined and further
amplified using oligos 2.3 and 2.6. The resulting product,
a 2.1 kb PCR fragment, is digested with XhoI and ApaI and
gel purified. Plasmid pMC1neo-C (see above) is digested
with SalI and XhoI and the 1.1 kb neo containing fragment
is gel purified. The purified 2.1 kb PCR fragment and the
1.1 kb neo fragment are then mixed and ligated to pBS-TPO4
(above) which has been cut with SalI and ApaI. The
ligation mixture is transformed into E. coli cells and a
plasmid with a single insert each of the fusion fragment
and the neo gene is identified, this plasmid having the
SalI site at the 3' end of the neo gene regenerated by
ligation to the SalI site in the polylinker of pBS-TPO4.
The resulting plasmid is designated pBS-TPO5.
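
The two-step fusion described above joins two PCR products through the
60 nucleotide region defined by the complementary primers. The Python
sketch below illustrates only the bookkeeping: the function name and
toy sequences are hypothetical, and the fragment sizes are the rounded
values given in the text, so the result is approximate.

```python
def fuse_by_overlap(left, right, overlap_len):
    """Join two fragments that share overlap_len bases at the
    3' end of `left` and the 5' end of `right`."""
    if left[-overlap_len:] != right[:overlap_len]:
        raise ValueError("fragments do not share the expected overlap")
    return left + right[overlap_len:]


def fused_length(left_len, right_len, overlap_len):
    """Length of the fused product; the overlap is counted once."""
    return left_len + right_len - overlap_len


# Rounded sizes from the text: 1.6 kb and 0.54 kb, 60 nt overlap
approx_product_bp = fused_length(1600, 540, 60)
```

With the rounded sizes this gives about 2080 bp, in line with the
2.1 kb fusion fragment described.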

A dhfr expression unit (to select for amplification in
targeted human cells) is then inserted at the ClaI site
located at the 5' end of the neo gene of pBS-TPO5. The
dhfr expression unit is isolated from plasmid pF8CIS9080
[Eaton et al., Biochemistry 25:8343-8347 (1986)] by
digestion with EcoRI and SalI. A 2 kb fragment containing
the dhfr expression unit is purified from this digest and
made blunt by treatment with the Klenow fragment of DNA
polymerase I. A ClaI linker (New England Biolabs, Beverly,
MA) is then ligated to the blunted dhfr fragment. The
products of this ligation are digested with ClaI and ligated
to ClaI digested pBS-TPO5. An aliquot of this ligation is
transformed into E. coli and plated on ampicillin selection
plates. Bacterial colonies are analyzed by restriction
enzyme digestion to determine the orientation of the
inserted dhfr fragment. One plasmid with dhfr in a
transcriptional orientation opposite that of the neo gene
is designated pBS-TPO6.

To complete plasmid pRTPO2, plasmid pBS(X)/5'Thromb.8
(Example 1) is partially digested with BamHI and ligated to
a SalI linker. The resulting DNA is then digested with
SalI and HindIII and the 3.7 kb fragment consisting of
sequences upstream of the TPO gene is isolated for use as a
second targeting sequence. This fragment is ligated to
HindIII-SalI digested pBS-TPO6 to generate the targeting
plasmid pRTPO2. For targeting to the TPO locus in cultured
human cells, pRTPO2 is digested with HindIII and EcoRI to
separate the targeting fragment containing the targeting
DNA, neo gene, dhfr gene, and CMV promoter from the pBS
plasmid backbone.

A third strategy for activation of the TPO gene is
shown in Figure 8. In this strategy, a targeting fragment
is introduced into the genome of recipient cells for
replacement of the normal TPO regulatory region, TPO exon
1, TPO intron 1, and TPO exon 2 with an exogenous
regulatory region, a coding exon, and a functional,
unpaired splice-donor site. Specifically, the targeting
construct from which this fragment is derived (pRTPO3) is
designed to include a first targeting sequence homologous
to sequences upstream of the TPO gene, an amplifiable
marker gene, a selectable marker gene, a regulatory region,
a CAP site, an exon which includes sequences coding for the
first 3 1/3 amino acids of the human growth hormone (hGH)
signal peptide, an unpaired splice-donor site, and a second
targeting sequence corresponding to TPO intron 2 sequences.
By this strategy, homologously recombinant cells produce an
mRNA precursor which corresponds to the exogenous coding
exon, intron 2 of the TPO gene, exon 3 of the TPO gene, and
the remaining exons, introns, and 3' untranslated regions
of the TPO gene (Figure 8). Splicing of this message
results in the fusion of the exogenous coding exon to exon
3 of the endogenous TPO gene which, when translated, will
produce a fusion protein in which the first 3 amino acids
of the signal peptide are derived from hGH. The signal
peptide of this molecule is cleaved off prior to secretion
from a cell to produce mature TPO. In this strategy the
first targeting sequence is upstream of the normal target
gene, while the second targeting sequence is within the
gene, between exons 2 and 3. The position of the first
targeting sequence and the amount of upstream DNA replaced
or deleted by the targeting event may be varied to optimize
the function of the regulatory region.

Plasmid pRTPO3 is constructed as follows:
Oligonucleotides 2.8 to 2.11 are used in PCR to fuse CMV IE
promoter sequences beginning at nucleotide 546 and ending
at nucleotide 1258 of Genbank sequence HS5MIEP to sequences
from the human growth hormone gene which encode the first 3
1/3 amino acids of the hGH signal peptide, a splice donor
site, and the second intron of the TPO gene. The
properties of these primers are as follows: Oligo 2.8 (SEQ
ID NO: 11) is a 30 base oligonucleotide homologous to a
segment of the CMV IE promoter beginning at nucleotide 546
of Genbank sequence HS5MIEP (-614 relative to the cap site)
and includes an XhoI site at its 5' end; 2.9 (SEQ ID NO:
12) and 2.10 (SEQ ID NO: 13) are 69 nucleotide
complementary primers which define the fusion of CMV
(position 2100 of Genbank sequence HS5MIEP) and hGH
(position -10 relative to the translation start
site of the hGH gene; see the hGH-N gene sequence in
Genbank entry HUMGHCSA) sequences. These primers also
include the first 29 base pairs of TPO intron 2
(nucleotides +14 to +42 relative to the TPO translation
start site), which include the splice donor site; 2.11 (SEQ
ID NO: 14) is 45 nucleotides in length and is homologous to
TPO sequences in TPO intron 2 starting at position +182
relative to the TPO translation start site and extending
upstream, and includes a natural EcoRI site at its 5' end.

The fusion fragment is created by first using oligos
2.8 and 2.9 to amplify a 0.7 kb fragment from CMV viral DNA
containing a wild-type immediate early gene and promoter
sequence. (The source of the CMV IE gene is not critical,
and other CMV IE promoter-based plasmids may be used.)
Then, oligos 2.10 and 2.11 are used to amplify a 0.17 kb
fragment containing a portion of TPO intron 2 from plasmid
pBS(X)/5'Thromb.8 (Example 1). The two amplified fragments
are then combined and further amplified using oligos 2.8
and 2.11. The resulting product, a 0.9 kb PCR fragment, is
digested with XhoI and EcoRI and gel purified. Next,
plasmid pBS(X)/5'Thromb.8 (Example 1) is partially
digested with BamHI and ligated to an XhoI linker. The
resulting DNA is then digested with XhoI and HindIII and
the 3.9 kb fragment consisting of sequences upstream of the
TPO gene is isolated for use as a second targeting
sequence. This fragment contains sequences from -5985 to
-2095 relative to the TPO translation start site (Figure
3). The isolated fragment is then ligated in a mixture
containing the 0.9 kb fusion fragment purified above and
HindIII and EcoRI digested plasmid pBS (Stratagene, Inc.,
La Jolla, CA) and transformed into competent E. coli cells
to generate pBS-TPO7.

For insertion of the neo selectable marker gene,
plasmid pMC1neo-C (see above) is digested with XhoI and
SalI and ligated to XhoI digested pBS-TPO7. The ligation
mix is transformed into E. coli cells and colonies are
analyzed by restriction enzyme analysis to identify a
plasmid with a single insert of the neo gene oriented such
that the direction of transcription is opposite to that of
the CMV promoter. This plasmid is designated pBS-TPO8.

A dhfr expression unit (to select for amplification in
targeted human cells) is then inserted at the ClaI site
located at the 5' end of the neo gene of pBS-TPO8. The
dhfr expression unit is isolated from plasmid pF8CIS9080
[Eaton et al., Biochemistry 25:8343-8347 (1986)] by
digestion with EcoRI and SalI. A 2 kb fragment containing
the dhfr expression unit is purified from this digest and
made blunt by treatment with the Klenow fragment of DNA
polymerase I. A ClaI linker (New England Biolabs, Beverly,
MA) is then ligated to the blunted dhfr fragment. The
products of this ligation are digested with ClaI and ligated
to ClaI digested pBS-TPO8. An aliquot of this ligation is
transformed into E. coli and plated on ampicillin selection
plates. Bacterial colonies are analyzed by restriction
enzyme digestion to determine the orientation of the
inserted dhfr fragment. One plasmid with dhfr in a
transcriptional orientation opposite that of the neo gene
is designated pRTPO3. For targeting to the TPO locus in
cultured human cells, pRTPO3 is digested with EcoRI and
HindIII to separate the targeting fragment containing the
targeting DNA, neo gene, dhfr gene, CMV promoter, and hGH
coding DNA from the pBS plasmid backbone.

Oligo 2.8 (SEQ ID NO: 11)

5' TTTT CTCGAG GACATTGATT ATTGACTAGT
        XhoI

Oligo 2.9 (SEQ ID NO: 12)

5' cgcggattcc ccgtgccaag CCTAGCGGCA ATGGCTACAG GTGAGAACAC
ACCTGAGGGG CTAGGGCCA

Oligo 2.10 (SEQ ID NO: 13)

5' TGGCCCTAGC CCCTCAGGTG TGTTCTCACC TGTAGCCATT GCCGCTAGGc
ttggcacggg gaatccgcg

Oligo 2.11 (SEQ ID NO: 14)

5' TTTT GAATTC CCATTCAGGA CCCAGACCTG AAACCCAGGG AATCC
        EcoRI

Oligos 2.8-2.11: Bases in lower-case type denote CMV
sequences; upper-case, non-bold bases denote TPO sequences;
boldface bases denote hGH exon 1 sequences.
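
The composition of oligo 2.9 can be examined with a short Python
sketch. The boldface marking of hGH bases is not available in plain
text, so the segment boundaries used below are an assumption inferred
from the description (20 CMV bases, hGH sequence from position -10
through the first 10 coding bases, then the first 29 bases of TPO
intron 2). The variable names and the three-entry codon table are
likewise illustrative only.

```python
OLIGO_2_9 = ("cgcggattcc ccgtgccaag CCTAGCGGCA ATGGCTACAG "
             "GTGAGAACAC ACCTGAGGGG CTAGGGCCA")
seq = "".join(OLIGO_2_9.split())

# Assumed segment boundaries (see lead-in)
cmv = seq[:20]              # lower-case CMV bases
hgh_5utr = seq[20:30]       # hGH positions -10 to -1
hgh_coding = seq[30:40]     # hGH coding bases +1 to +10
tpo_intron2 = seq[40:]      # first 29 bases of TPO intron 2

# Only the codons needed for this fragment
CODONS = {"ATG": "M", "GCT": "A", "ACA": "T"}
n_full = len(hgh_coding) // 3
peptide = "".join(CODONS[hgh_coding[i * 3:i * 3 + 3]] for i in range(n_full))
leftover = hgh_coding[n_full * 3:]

# The exogenous exon replaces TPO exon 2, whose coding part is 13 bp;
# the reading frame is kept only if both end in the same codon phase.
TPO_EXON2_CODING = 13
frame_preserved = len(hgh_coding) % 3 == TPO_EXON2_CODING % 3
```

On these assumptions the exon encodes three complete codons plus one
base of a fourth, i.e. the "3 1/3 amino acids" of the text, and it ends
in the same codon phase as the coding portion of TPO exon 2, so
splicing to TPO exon 3 would maintain the reading frame.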

Other approaches for targeting and activation of the
TPO gene may be employed. For example, the first and
second targeting sequences may correspond to sequences in
the first or second intron of the TPO gene, and the
targeting sequences may include TPO coding sequences. In
any activation strategy, the second targeting sequence does
not need to lie immediately adjacent to or near the first
targeting sequence in the normal gene, such that portions
of the gene's normal upstream region are deleted upon
homologous recombination. Furthermore, one targeting
sequence may be upstream of the gene and one may be within
an exon or intron of the TPO gene.

A selectable marker gene is optional and the
amplifiable marker gene is only required when amplification
is desired. The amplifiable marker gene and selectable
marker gene may be the same gene, their positions may be
reversed, and one or both may be situated in the intron of
the targeting construct. Amplifiable marker genes and
selectable marker genes suitable for selection are
described herein. The incorporation of a specific CAP site
is optional. The regulatory region, CAP site, first
non-coding exon, splice-donor site, intron, second
non-coding exon, and splice-acceptor site may be isolated
as a complete unit from the human elongation factor-1α
(EF-1α; Genbank sequence HUMEF1A) gene or the
cytomegalovirus (CMV; Genbank sequence HEHCMVP1) immediate
early region, or the components can be assembled from
appropriate components isolated from different genes. In
any case, either exogenous exon may be the same or
different from the first exon of the normal TPO gene, and
multiple non-coding exons may be present in the targeting
construct.

As described herein, a number of selectable and
amplifiable markers may be used in the targeting
constructs, and the activation may be effected in a large
number of cell types.

EXAMPLE 3: In Vitro Production of TPO by Activation and
Amplification of the TPO Gene in an Immortalized Cell Line

Transfection of primary, secondary, or immortalized
human cells and isolation of homologously recombinant cells
expressing TPO may be accomplished using the methods
described in U.S. Serial No. 08/243,391, incorporated by
reference. Homologously recombinant cells may be
identified by a PCR screening strategy as exemplified therein
and in published methods available to one skilled in the
art (see, for example, Kim, H-S and Smithies, O., Nucl.
Acids Res. 16:8887-8903 (1988)). The identification of
cells expressing TPO may also be accomplished using a
variety of assays based on the structure or properties of
TPO. For example, TPO may be functionally identified by an
in vitro or in vivo megakaryocytopoiesis assay (de Sauvage
et al., Nature 369:533-538 (1994)). Alternatively, TPO may
be assayed by the stimulation of proliferation of cells
expressing c-mpl, the receptor for TPO. In this
assay, cells such as Ba/F3-mpl cells (de Sauvage et al.,
Nature 369:533-538 (1994)) are exposed to TPO and cell
proliferation is monitored by ³H-thymidine uptake. TPO may
also be assayed through its effects on in vivo platelet
production, either by direct platelet counts or by
incorporation of ³⁵S into platelets. Finally, peptides
corresponding to portions of the TPO molecule may be
synthesized in order to generate anti-TPO antibodies for
use in an ELISA assay.

The isolation of cells containing amplified copies of
the amplifiable marker gene and the activated TPO locus is
performed as described in U.S. Serial No. 07/985,586,
incorporated by reference.

EXAMPLE 4: Cloning of the Human DNase I Gene and
Identification of the 5' Flanking Sequences

The human DNase I gene was isolated from a human
genomic DNA library. The library (Clontech, Palo Alto, CA;
Cat. #HL1006d) was constructed by cloning MboI partially
digested male leukocyte DNA into the BamHI site of the
bacteriophage lambda vector EMBL3. For library screening,
a DNA probe was isolated by PCR amplification of human
genomic DNA using oligonucleotides 4.1 and 4.2.

Oligo 4.1 (SEQ ID NO: 15) 
5' TGCCTTGAAG TGCTTCTTCA 
Oligo 4.2 (SEQ ID NO: 16) 
5' CCTCAGAGAT GACGAGAATG C 

These primers were designed based on the published
DNase I mRNA sequence (Shak, S. et al., Proc. Natl. Acad.
Sci. USA 87:9188-9192 (1990)). The amplified probe (probe
A; 126 bp) was labeled with ³²P-dCTP by PCR and used to
screen a bacteriophage lambda genomic DNA library. The
filters were hybridized for 16 hours at 68 °C in 125 mM
Na2HPO4 (pH 7.2), 250 mM NaCl, 10% PEG 8000, 7% SDS, 1 mM
EDTA. Filters were washed two times in 500 ml of 20 mM
Na2HPO4 (pH 7.2), 5% SDS, 1 mM EDTA, followed by 4 washes
in 500 ml of 20 mM Na2HPO4 (pH 7.2), 1% SDS, 1 mM EDTA.
The wash buffers were preheated to 56 °C and washing was
performed at room temperature on a rotary shaker for
approximately 5 minutes per wash. The hybridization
signals were visualized by autoradiography at -80 °C with an
intensifying screen. In this experiment, approximately 1 x
10^ phage were screened and 18 positive signals were
obtained. Bacteriophage plaques corresponding to 10 of the
positive signals were plated at low density and subjected
to a second round of screening using probe A. Four of the
phage (designated 2a, 3b, 4c and 14a) gave positive
hybridization signals following the secondary screening and
were retained for further analysis. DNA was isolated from
the plaque purified phage following amplification and
subsequent purification by cesium chloride gradient
ultracentrifugation (Yamamoto, K.R. et al., Virology 40:734
(1970)). Library screening, plaque purification of
recombinant bacteriophage and isolation of bacteriophage
DNA were performed using standard methods (Ausubel et al.,
Current Protocols in Molecular Biology, Wiley, New York, NY
(1987)).
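
The hybridization buffer above is specified only by final
concentrations. As a rough illustration of how such a buffer
might be assembled from concentrated stocks, the sketch below
applies C1*V1 = C2*V2 to each component. The stock
concentrations and the 100 ml batch size are assumptions chosen
for illustration; they are not taken from this specification.

```python
# Final concentrations come from the hybridization buffer described above.
# Stock concentrations are HYPOTHETICAL and chosen only for illustration.
HYB_BUFFER = {
    # component: (final concentration, assumed stock concentration)
    "Na2HPO4 pH 7.2 (mM)": (125.0, 1000.0),
    "NaCl (mM)": (250.0, 5000.0),
    "PEG 8000 (%)": (10.0, 50.0),
    "SDS (%)": (7.0, 20.0),
    "EDTA (mM)": (1.0, 500.0),
}


def stock_volumes(recipe, total_ml):
    """Return ml of each stock (C1*V1 = C2*V2) plus the water needed to reach total_ml."""
    volumes = {
        name: final / stock * total_ml for name, (final, stock) in recipe.items()
    }
    water = total_ml - sum(volumes.values())
    if water < 0:
        raise ValueError("assumed stocks are too dilute for this final volume")
    volumes["water"] = water
    return volumes


volumes = stock_volumes(HYB_BUFFER, total_ml=100.0)  # 100 ml batch is an assumption
for name, ml in volumes.items():
    print(f"{name:22s} {ml:6.2f} ml")
```

The same function could be reused for the two wash buffers by
swapping in their final concentrations.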

Based on restriction enzyme digestion and Southern
blot analysis using probe A, two of the phage (4c and 14a)
contain a common HincII fragment of approximately 8 kb
which encompasses exon 1, intron 1, exon 2, coding and
non-coding sequences corresponding to intron 2 and
downstream DNase I exons, as well as approximately 4 kb of
non-transcribed DNA lying upstream of DNase I exon 1. This
fragment was isolated from one genomic clone (4c) and
subcloned into pBSIISK+ (Stratagene Inc., La Jolla, CA) for
further analysis. Restriction enzyme mapping of the
resultant clone, pBS/4C,2Hinc2, was used to generate the
restriction map shown in Figure 9. The nucleotide sequence
of the non-transcribed DNase I 5' region lying upstream of
the 5' end of the known cDNA sequence is shown in Figure 10
(SEQ ID NO: 17). The nucleotide sequence lying downstream
of the 5' end of the known cDNA sequence, including exon 1,
intron 1 and part of exon 2, is shown in Figure 11 (SEQ ID
NO: 18). Comparison of the cloned genomic sequence
presented here with the published cDNA sequence (Shak, S.
et al., Proc. Natl. Acad. Sci. USA 87:9188-9192 (1990))
reveals that the 5' end of the DNase I gene consists of a
non-coding exon (exon 1) of 142 bp and a second exon (exon
2) which is at least 341 bp. Exon 2 encodes a 22 amino
acid signal sequence and a portion of the mature DNase I
peptide, beginning with an AUG translational initiation
codon which lies 1 bp downstream of the 5' end of exon 2.
Exons 1 and 2 are separated by intron 1 which is 336 bp in
length.
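
The exon and intron sizes reported above can be laid out as
coordinates to see where the AUG falls in the primary
transcript and in the spliced mRNA. The sketch below is only a
bookkeeping aid: it numbers from the 5' end of exon 1, and it
assumes that "1 bp downstream of the 5' end of exon 2" means
one exon 2 nucleotide precedes the A of the AUG. That reading
is an assumption.

```python
# Sizes quoted in the text above.
EXON1_BP = 142     # non-coding exon 1
INTRON1_BP = 336   # intron 1

# ASSUMPTION: one exon 2 nucleotide lies upstream of the A of the AUG.
AUG_OFFSET_IN_EXON2 = 1

# 1-based coordinates, counted from the 5' end of exon 1.
exon1 = (1, EXON1_BP)
intron1 = (exon1[1] + 1, exon1[1] + INTRON1_BP)
exon2_start = intron1[1] + 1

# Position of the A of the AUG before and after removal of intron 1.
aug_in_pre_mrna = exon2_start + AUG_OFFSET_IN_EXON2
five_prime_utr_bp = EXON1_BP + AUG_OFFSET_IN_EXON2
aug_in_mrna = five_prime_utr_bp + 1

print("exon 1:", exon1)
print("intron 1:", intron1)
print("exon 2 starts at:", exon2_start)
print("AUG in pre-mRNA:", aug_in_pre_mrna, "| AUG in spliced mRNA:", aug_in_mrna)
```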

EXAMPLE 5: Construction of Targeting Plasmids for
Activation and Amplification of the DNase I
Gene

The activation of the DNase I gene can be accomplished
by the strategy outlined in Figure 12. In this strategy, a
targeting fragment is introduced into the genome of
recipient cells for insertion of a regulatory region, a
non-coding exon and a functional unpaired splice-donor site
upstream of the DNase I coding region. Specifically, the
targeting construct from which this fragment is derived
(pDNaseI) is designed to include a 5' targeting sequence
homologous to sequences upstream of the DNase I gene, a
selectable marker gene, an amplifiable marker gene, a
regulatory region, a CAP site, a non-coding exon, an
unpaired splice-donor site, and a 3' targeting sequence
corresponding to sequences downstream of the 5' targeting
sequence but upstream of DNase I exon 1. According to this
strategy, integration of the targeting construct by
homologous recombination generates recombinant cells
producing an mRNA precursor which includes the non-coding
exon introduced upstream of the DNase I gene, the 3'
targeting sequence, any sequences between the 3' targeting
sequence and exon 2 of the DNase I gene, and the remaining
exons, introns and 3' untranslated regions of the DNase I
gene (Figure 12). Splicing of this transcript results in
the fusion of the exogenous non-coding exon to exon 2 of
the endogenous DNase I gene. DNase I is produced by
translation of the mature mRNA. According to this
strategy, both the 5' and 3' targeting sequences are
upstream of the endogenous target gene. The size of the
chimeric intron in the targeting construct, which is
dictated by the position of the regulatory region relative
to the coding sequence, may be varied to optimize the
function of the regulatory region.
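
The splicing outcome described above can be pictured with a
toy model. The sketch below lists the segments of the mRNA
precursor in 5' to 3' order and keeps the first exon plus
every later exon that carries a splice acceptor. The segment
list, the `Segment` type and the single-rule `splice` function
are illustrative assumptions, not part of the described
method. Real splice-site selection is considerably more
involved.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    name: str
    is_exon: bool
    # A first exon (exogenous or endogenous) has no upstream splice acceptor.
    has_splice_acceptor: bool = False


# Simplified mRNA precursor of the targeted locus, 5' to 3' (illustrative only).
PRE_MRNA = [
    Segment("exogenous non-coding exon", is_exon=True),
    Segment("3' targeting sequence", is_exon=False),
    Segment("DNase I exon 1", is_exon=True),
    Segment("DNase I intron 1", is_exon=False),
    Segment("DNase I exon 2", is_exon=True, has_splice_acceptor=True),
    Segment("DNase I intron 2", is_exon=False),
    Segment("downstream DNase I exons", is_exon=True, has_splice_acceptor=True),
]


def splice(pre_mrna):
    """Keep the first exon, then every later exon that has a splice acceptor."""
    first, rest = pre_mrna[0], pre_mrna[1:]
    return [first.name] + [s.name for s in rest if s.is_exon and s.has_splice_acceptor]


mature_mrna = splice(PRE_MRNA)
print(" + ".join(mature_mrna))
```

In this model everything between the unpaired splice-donor
site and the exon 2 acceptor, including endogenous exon 1, is
treated as one chimeric intron.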

Plasmid pCND1, which contains the activation cassette,
is constructed as follows: A 1555 bp (size includes a 9 bp
synthetic HindIII recognition site at the 5' end of oligo
5.2) fragment is amplified using oligos 5.1 and 5.2. The
amplified fragment encompasses the CMV IE promoter, CMV IE
exon 1 (non-coding exon) and 827 bp of CMV IE intron 1,
beginning at nucleotide 172,783 and ending at nucleotide
174,328 of EMBL sequence X17403 (Human cytomegalovirus
strain AD169). (The source of the CMV IE gene is not
critical, and CMV IE promoter-based plasmids or wild-type
CMV DNA may be used.) Oligo 5.1 (21 bp, SEQ ID NO: 19)
hybridizes to the CMV IE promoter at -598 relative to the
CAP site (EMBL sequence X17403). Oligo 5.2 (32 bp, SEQ ID
NO: 20) contains 23 nucleotides which hybridize to the CMV
IE promoter at +946 relative to the CAP site; the
additional 9 bp at the 5' end of the oligo create a
synthetic HindIII recognition sequence. The 1555 bp PCR
product is digested with HindIII and the resultant 1551 bp
fragment is purified and used in the ligation described
below. Next, the neomycin phosphotransferase (neo) gene is
isolated from plasmid pBSneo for use as a selectable marker
for the isolation of stably transfected human cells. The
neo gene in plasmid pBSneo was obtained by BamHI and XhoI
digestion of pMC1neo-polyA (Thomas, K.R. and Capecchi, M.R.,
Cell 51:503-512 (1987)). Plasmid pMC1neo-polyA was
digested with BamHI and made blunt-ended with the Klenow
fragment of E. coli DNA polymerase I. The resulting DNA
was digested with XhoI, and the blunt-ended BamHI-XhoI
fragment was cloned into HincII and XhoI digested plasmid
pBSIISK+. For isolation of the neo gene harbored on
pBSneo, plasmid pBSneo is digested with XhoI and made
blunt-ended by treatment with the Klenow fragment of E.
coli DNA polymerase I. The resulting DNA is digested with
HindIII and an 1165 bp fragment containing the neo
expression unit is gel purified. The 1165 bp neo fragment
and the 1551 bp CMV promoter fragment are ligated, the
ligation products are digested with HindIII and the 2716 bp
HindIII fragment, resulting from blunt-end ligation of the
two fragments, is gel purified. The 2716 bp HindIII
product is ligated to HindIII digested plasmid pBSIISK+
(Stratagene Inc., La Jolla, CA) and electroporated into E.
coli. Colonies containing inserts in the HindIII site of
pBSIISK+ are analyzed by restriction enzyme analysis to
confirm the orientation of the insert. One recombinant
plasmid in which the CMV promoter is oriented such that the
oligo 5.2 sequences (+946 relative to the CMV IE CAP site)
are proximal to the SalI recognition sequence in the
pBSIISK+ polylinker is identified and designated pCN1.

Oligo 5.1 (SEQ ID NO: 19)
5' GACATTGATT ATTGACTAGT T
Oligo 5.2 (SEQ ID NO: 20)

5' TTTAAGCTTC TGCAGAAAAG ACCCATGGAA AG 
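
The fragment sizes quoted for the pCN1 construction can be
cross-checked from the oligo sequences and the EMBL X17403
coordinates given in this example. The sketch below is a
simple arithmetic check, not a cloning simulation. It assumes
that fragment sizes are counted on the top strand, so that
HindIII digestion (A^AGCTT) removes every base 5' of the cut
within the synthetic tail. The helper name `span_bp` is
introduced here for illustration.

```python
# Oligo sequences and sizes are the ones quoted in this example.
OLIGO_5_1 = "GACATTGATTATTGACTAGTT"
OLIGO_5_2 = "TTTAAGCTTCTGCAGAAAAGACCCATGGAAAG"
HINDIII_SITE = "AAGCTT"  # HindIII cuts A^AGCTT
NEO_FRAGMENT_BP = 1165


def span_bp(first, last):
    """Inclusive length of a stretch between two sequence coordinates."""
    return abs(last - first) + 1


# CMV IE template region plus the synthetic tail carried by oligo 5.2.
template_bp = span_bp(172_783, 174_328)
tail_bp = len(OLIGO_5_2) - 23  # 23 nt of oligo 5.2 anneal to the template
pcr_product_bp = template_bp + tail_bp

# ASSUMPTION: sizes are counted on the top strand, so digestion removes the
# bases 5' of the cut position inside the tail.
site_start = OLIGO_5_2.index(HINDIII_SITE)
trimmed_bp = site_start + 1
digested_bp = pcr_product_bp - trimmed_bp

# Blunt-end ligation to the neo fragment.
ligated_bp = digested_bp + NEO_FRAGMENT_BP

print(len(OLIGO_5_1), len(OLIGO_5_2))
print(template_bp, pcr_product_bp, digested_bp, ligated_bp)
```

Under these assumptions the computed values agree with the
1555 bp, 1551 bp and 2716 bp figures stated in the text.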



Next, the dhfr expression unit is inserted at a ClaI
site which is located at the 3' end of the neo gene of
pCN1. The dhfr expression unit is obtained by EcoRI and
SalI digestion of plasmid pF8CIS9080 (Eaton et al.,
Biochemistry 25:8343-8347 (1986)). The resultant 2 kb
fragment is purified from the digest and made blunt with
the Klenow fragment of E. coli DNA polymerase I. A ClaI
linker (5' CCATCGATGG; NEB 1088, New England Biolabs,
Beverly, MA) is ligated to the blunt-end dhfr fragment and
the ligation products are digested with ClaI. pCN1 is
digested with ClaI, and the ClaI dhfr containing fragment
is ligated into the ClaI site of pCN1. An aliquot of the
ligation reaction is electroporated into E. coli and
colonies harboring inserts in a ClaI site of pCN1 are
analyzed by restriction enzyme analysis to determine the
site of insertion and the orientation of the insert. A
plasmid with the dhfr expression unit at the 3' end of the
neo gene and with the same transcriptional orientation as
that of the neo gene is identified and designated pCND1.

Plasmid pDNaseI is constructed as follows: Based on
the restriction map of the upstream region of the DNase I
gene (Figure 9), a 664 bp BamHI fragment (-1161 to -498 in
Figure 8) can be isolated from subclone pBS/4C,2Hinc2.
This fragment is ligated to BamHI digested plasmid
pBSIISK+dApaI (modification of pBSIISK+; Stratagene Inc.,
La Jolla, CA) in which the ApaI recognition sequence in the
polylinker is destroyed. pBSIISK+dApaI is constructed by
digesting pBSIISK+ with ApaI, conversion of the
cohesive-ends to blunt-ends with T4 DNA polymerase and
ligation to generate the circular plasmid. Following
ligation of the 664 bp BamHI fragment into pBSIISK+dApaI,
the ligation products are electroporated into E. coli cells
to generate pBS-DNaseI. The sequences contained in this
fragment reside upstream of DNase I exon 1, position -1161
to -498 with respect to the AUG translational initiation
codon (nucleotide +1). The activation cassette which
contains the CMV immediate-early (IE) promoter region, the
CMV IE CAP site, a non-coding exon, an unpaired splice
donor site, the neomycin phosphotransferase (neo)
selectable marker gene and dhfr expression unit (to select
for amplification in targeted human cells) is cloned into
the unique ApaI site of the 664 bp BamHI fragment (DNase I
upstream region) in pBS-DNaseI (see Figure 12).
Specifically, plasmid pCND1, which contains the activation
cassette, is digested with SalI which cuts downstream of
the dhfr expression unit and EspI which cuts 242 bp
downstream of the CMV IE CAP site. A 3,955 bp SalI-EspI
fragment containing the activation cassette is purified
from this digest and the cohesive-ends are made blunt by
treatment with the Klenow fragment of E. coli DNA
polymerase I. This fragment is ligated to plasmid
pBS-DNaseI, which has been digested with ApaI and made
blunt-ended by treatment with T4 DNA polymerase, and
electroporated into E. coli. Colonies containing inserts
of the activation cassette inserted at the blunt-ended ApaI
site of pBS-DNaseI are analyzed by restriction enzyme
analysis to confirm the orientation of the insert. One
recombinant plasmid in which the CMV promoter is oriented
such that the direction of transcription is towards DNase I
exon 1 is identified and designated pDNaseI.

Plasmid pDNaseI is digested with BamHI for
transfection into human cells. Transfection of primary,
secondary, or immortalized human cells and isolation of
homologously recombinant cells expressing DNase I may be
accomplished using the methods described in U.S. Serial No.
08/243,391 and incorporated herein by reference.
Homologously recombinant cells may be identified by a PCR
screening strategy as exemplified therein and in published
methods available to one skilled in the art (see, for
example, Kim, H-S and Smithies, O., Nucl. Acids Res.
16:8887-8903 (1988)). The identification of cells
expressing DNase I may also be accomplished using a variety
of assays based on the structure or properties of DNase I.
For example, DNase I may be functionally identified by an
in vitro enzyme assay (cf. Kunitz, J. Gen. Physiol. 33:349
(1950); McDonald, Meth. Enzymol. 2:437 (1955)) or by the
use of anti-DNase I antibodies in an ELISA assay.

The isolation of cells containing amplified copies of 
the amplifiable marker gene and the activated DNase I locus 
is performed as described in U.S. Serial No.: 07/985,586 
incorporated herein by reference. 

EXAMPLE 6: Cloning of the Human β-Interferon Gene and
Identification of the 5' Flanking Sequences

The human β-interferon gene was isolated from a human
genomic DNA library. The library (Clontech, Palo Alto, CA;
Cat. #HL1006d) was constructed by cloning MboI partially
digested male leukocyte DNA into the BamHI site of the
bacteriophage lambda vector EMBL3. For library screening,
a DNA probe was isolated by PCR amplification of human
genomic DNA using oligonucleotides 6.1 and 6.2.

Oligo 6.1 (SEQ ID NO: 21)
5' TGCTCTGGCA CAACAGGTAG
Oligo 6.2 (SEQ ID NO: 22)
5' CATAGATGGT CAATGCGGC

These primers were designed based on the published
β-interferon mRNA sequence (May, L.T. and Sehgal, P.B., J.
Interferon Res. 5:521-526 (1985)). The amplified probe
(probe A; 290 bp) was labeled with ³²P-dCTP by PCR and used
to screen a bacteriophage lambda genomic DNA library. The
filters were hybridized for 16 hours at 68 °C in 125 mM
Na2HPO4 (pH 7.2), 250 mM NaCl, 10% PEG 8000, 7% SDS, 1 mM
EDTA. Filters were washed two times in 500 ml of 20 mM
Na2HPO4 (pH 7.2), 5% SDS, 1 mM EDTA, followed by 4 washes
in 500 ml of 20 mM Na2HPO4 (pH 7.2), 1% SDS, 1 mM EDTA.
The wash buffers were preheated to 56 °C and washing was
performed at room temperature on a rotary shaker for
approximately 5 minutes per wash. The hybridization
signals were visualized by autoradiography at -80 °C with an
intensifying screen. In this experiment, approximately 1 x
10^ phage were screened and 6 positive signals were
obtained. Bacteriophage plaques corresponding to the
positive signals were plated at low density and subjected
to a second round of screening using probe A. Five of the
phage (designated 1a, 2a, 2b, 11a, and 12a) gave positive
hybridization signals following the secondary screening and
were retained for further analysis. DNA was isolated from
the plaque purified phage following amplification and
subsequent purification by cesium chloride gradient
ultracentrifugation (Yamamoto, K.R. et al., Virology 40:734
(1970)). Library screening, plaque purification of
recombinant bacteriophage and isolation of bacteriophage
DNA were performed using standard methods (Ausubel et al.,
Current Protocols in Molecular Biology, Wiley, New York, NY
(1987)).

Based on restriction enzyme digestion and Southern
blot analysis using probe A, all five of the phage (1a, 2a,
2b, 11a, and 12a) were shown to contain a common HindIII
fragment of approximately 10 kb which encompasses the
entire sequence coding for β-interferon (561 bp), 666 bp of
3' untranslated sequence and approximately 9 kb of
non-transcribed DNA lying upstream of the β-interferon
gene. This fragment was isolated from one genomic clone
(1a) and subcloned into pBSIISK+ (Stratagene Inc., La
Jolla, CA) for further analysis. The resultant clones,
pBS-H3/Bint.11-3 and pBS-H3/Bint.11-21, harbor the 10 kb
HindIII fragment in opposite orientations with respect to
the plasmid backbone. Restriction enzyme mapping was used
to generate the restriction map shown in Figure 13. The
nucleotide sequence of 8,355 bp of DNA lying upstream of
the previously reported sequence (Genbank entry HUMIFNB1F)
is shown in Figure 14 (SEQ ID NO: 23). The nucleotide
sequence corresponding to 356 bp of DNA upstream of the
β-interferon coding region, the β-interferon coding region,
and 666 bp of 3' untranslated sequence is shown in Figure
15 (SEQ ID NO: 24). Comparison of the cloned genomic
sequence presented here with the published cDNA sequence
(May, L.T. and Sehgal, P.B., J. Interferon Res. 5:521-526
(1985)) confirms that the β-interferon gene consists of a
561 bp coding region which is co-linear with its cognate
mRNA (lacks introns). The β-interferon gene encodes a 21
amino acid signal sequence and a 166 amino acid mature
peptide, beginning with an AUG translational initiation
codon which lies 82 bp downstream of the CAP site.
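
The coding-region and signal-sequence lengths given above fix
the length of the mature peptide. The short sketch below works
that out. It assumes the 561 bp figure counts sense codons
only, with the stop codon excluded; that reading is an
assumption.

```python
CODING_BP = 561         # beta-interferon coding region, from the text
SIGNAL_PEPTIDE_AA = 21  # signal sequence length, from the text

# ASSUMPTION: the 561 bp figure excludes the stop codon.
codons, remainder = divmod(CODING_BP, 3)
if remainder:
    raise ValueError("coding region is not a whole number of codons")

precursor_aa = codons
mature_aa = precursor_aa - SIGNAL_PEPTIDE_AA

print(f"precursor: {precursor_aa} aa, mature peptide: {mature_aa} aa")
```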

EXAMPLE 7: Construction of Targeting Plasmids for
Activation and Amplification of the
β-Interferon Gene

The activation of the β-interferon gene can be
accomplished by the strategy outlined in Figure 16. In
this strategy, a targeting fragment is introduced into the
genome of recipient cells for replacement of the endogenous
β-interferon regulatory region with an exogenous regulatory
region, a non-coding exon, an intron, and chimeric exon
sequences consisting of sequences from a noncoding exon
(derived from exon 2 of the CMV IE gene) and sequences from
the β-interferon 5' noncoding region. Specifically, the
targeting construct from which this fragment is derived
(pIFNβ-1) is designed to include a 5' targeting sequence
homologous to sequences upstream of the β-interferon gene,
a selectable marker gene, an amplifiable marker gene, a
regulatory region, a CAP site, a non-coding exon, an
intron, chimeric exon sequences consisting of CMV IE exon 2
sequences and β-interferon 5' noncoding DNA, and a 3'
targeting sequence homologous to DNA upstream of the
β-interferon coding region. According to this strategy,
integration of the targeting construct by homologous
recombination generates recombinant cells producing an mRNA
precursor which includes the non-coding exon introduced
upstream of the β-interferon gene, an intron, the chimeric
exon which fuses CMV IE exon sequences to β-interferon 5'
noncoding sequences and the entire β-interferon coding
region, and 3' untranslated regions of the β-interferon
gene (Figure 16). The chimeric exon consists of 17 bp of
CMV IE exon 2 (position 172,782 to 172,766 of EMBL sequence
X17403) joined to the 5' flanking region of the
β-interferon gene (position -173 with respect to the AUG
translational initiation codon). Splicing of this
transcript results in the fusion of the exogenous
non-coding exon to exon 2 which includes the complete
coding sequence of the endogenous β-interferon gene.
β-interferon is produced by translation of the mature mRNA.
According to this strategy, the 5' targeting sequence is
upstream of the endogenous target gene and the 3' targeting
sequence is in the β-interferon 5' noncoding region. The
position of the regulatory region relative to the 5'
flanking sequence may be varied (e.g. by altering the size
of the intron in the targeting construct) to optimize the
function of the regulatory region.

Plasmid pIFNβ-1 is constructed as follows: A 182 bp
fragment (size includes a 9 bp synthetic BamHI recognition
site at the 5' end of oligo 7.2) is amplified from
pBS-H3/Bint.11-3 using oligos 7.1 and 7.2. The amplified
fragment serves as the 3' targeting sequence (Figure 16).
Oligo 7.1 (21 bp, SEQ ID NO: 25) hybridizes to the
β-interferon 5' non-transcribed region at position -173
with respect to the β-interferon AUG translational
initiation codon (Figure 15). Oligo 7.2 (30 bp, SEQ ID NO:
26) contains 21 nucleotides which hybridize to the
β-interferon 5' untranslated region at position -1 relative
to the AUG translational start codon (see Figure 16), with
the additional 9 bp at the 5' end of the oligo creating a
synthetic BamHI recognition sequence. The 182 bp PCR
product is purified and used in the ligation described
below. Next, a 1571 bp (size includes an 8 bp synthetic
SmaI recognition sequence at the 5' end of oligo 7.3)
fragment is amplified using oligos 7.3 and 7.4. The
amplified fragment encompasses the CMV IE promoter, CMV IE
exon 1 (non-coding exon), CMV IE intron 1 and 17 bp of CMV
IE exon 2, beginning at nucleotide 174,328 and ending at
nucleotide 172,766 of EMBL sequence X17403 (Human
cytomegalovirus strain AD169). (The source of the CMV IE
gene is not critical, and CMV IE promoter-based plasmids or
wild-type CMV DNA may be used.) Oligo 7.3 (29 bp, SEQ ID
NO: 27) contains 21 nucleotides which hybridize to the CMV
IE promoter at -598 relative to the CAP site (EMBL sequence
X17403); the 5' end of the oligo also contains an 8 bp
synthetic SmaI recognition sequence. Oligo 7.4 (21 bp, SEQ
ID NO: 28) hybridizes to the CMV IE promoter at +965
relative to the CAP site. The 1571 bp PCR product
containing the CMV IE promoter, CMV IE exon 1, CMV IE
intron 1 and 17 bp of CMV IE exon 2 is gel purified and
ligated to the 182 bp fragment containing the β-interferon
5' flanking region. The ligation products are digested
with BamHI and SmaI, and the 1742 bp SmaI-BamHI fragment,
resulting from ligation of β-interferon sequences (position
-173 with respect to the AUG translational initiation
codon) to CMV IE sequences (-598 relative to the CMV IE CAP
site), is gel purified. The 1742 bp SmaI-BamHI fragment is
ligated to BamHI and SmaI digested plasmid pBSIISK+
(Stratagene Inc., La Jolla, CA) and electroporated into E.
coli. Colonies containing inserts in pBSIISK+ are analyzed
by restriction enzyme analysis to confirm the structure of
the insert. One recombinant plasmid is identified and
designated pBS-CB.

Oligo 7.1 (SEQ ID NO: 25)
5' TGACATAGGA AAACTGAAAG G

Oligo 7.2 (SEQ ID NO: 26)
5' TTTGGATCCG TTGACAACAC GAACAGTGTC G

Oligo 7.3 (SEQ ID NO: 27)
5' TTTCCCGGGA CATTGATTAT TGACTAGTT

Oligo 7.4 (SEQ ID NO: 28)
5' CGTGTCAAGG ACGGTGACTG C
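
As a quick consistency check on the primers listed above, the
sketch below scans each oligo for the BamHI (GGATCC) and SmaI
(CCCGGG) recognition sequences and recomputes the two amplicon
sizes and the CMV IE exon 2 segment from the coordinates given
in this example. The helper names `sites_in` and `span_bp` are
introduced here for illustration. Because both recognition
sequences are palindromic, searching one strand is sufficient.

```python
# Oligo sequences as listed in this example (spaces removed).
OLIGOS = {
    "7.1": "TGACATAGGAAAACTGAAAGG",
    "7.2": "TTTGGATCCGTTGACAACACGAACAGTGTCG",
    "7.3": "TTTCCCGGGACATTGATTATTGACTAGTT",
    "7.4": "CGTGTCAAGGACGGTGACTGC",
}
SITES = {"BamHI": "GGATCC", "SmaI": "CCCGGG"}


def sites_in(seq):
    """Map enzyme name to the 0-based index of its first site in seq."""
    return {enzyme: seq.index(site) for enzyme, site in SITES.items() if site in seq}


def span_bp(first, last):
    """Inclusive length of a stretch between two sequence coordinates."""
    return abs(last - first) + 1


tails = {name: sites_in(seq) for name, seq in OLIGOS.items()}

# 3' targeting sequence: -173 to -1 relative to the AUG, plus the 9 nt tail.
three_prime_target_bp = span_bp(-173, -1) + 9
# CMV IE fragment: EMBL X17403 174,328 to 172,766, plus the 8 nt tail.
cmv_fragment_bp = span_bp(174_328, 172_766) + 8
# CMV IE exon 2 portion of the chimeric exon.
cmv_exon2_bp = span_bp(172_782, 172_766)

for name, found in tails.items():
    print(f"oligo {name}: {found or 'no BamHI/SmaI site'}")
print(three_prime_target_bp, cmv_fragment_bp, cmv_exon2_bp)
```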

The neomycin phosphotransferase (neo) gene is isolated
from plasmid pBSneo for use as a selectable marker for the
isolation of stably transfected human cells. The neo gene
in plasmid pBSneo was obtained by BamHI and XhoI digestion
of pMC1neo-polyA (Thomas, K.R. and Capecchi, M.R., Cell
51:503-512 (1987)). Plasmid pMC1neo-polyA was digested
with BamHI and made blunt-ended with the Klenow fragment of
E. coli DNA polymerase I. The resulting DNA was digested
with XhoI, and the blunt-ended BamHI-XhoI fragment was
cloned into HincII and XhoI digested plasmid pBSIISK+. For
isolation of the neo gene harbored on pBSneo, plasmid
pBSneo is digested with XhoI and made blunt-ended by
treatment with the Klenow fragment of E. coli DNA
polymerase I. The resulting DNA is digested with HindIII
and an 1165 bp fragment containing the neo expression unit
is gel purified. The 1165 bp fragment is ligated to SmaI
and HindIII digested plasmid pBS-CB and electroporated into
E. coli. Colonies containing inserts in pBS-CB are
analyzed by restriction enzyme analysis to confirm the
orientation of the insert. One recombinant plasmid is
identified and designated pBS-CBN.

Next, the dhfr expression unit is inserted at the ClaI
site which is located at the 3' end of the neo gene of
pBS-CBN. The dhfr expression unit is obtained by EcoRI and
SalI digestion of plasmid pF8CIS9080 (Eaton et al.,
Biochemistry 25:8343-8347 (1986)). The resultant 2 kb
fragment is purified from the digest and made blunt with
the Klenow fragment of E. coli DNA polymerase I. A ClaI
linker (5' CCATCGATGG; NEB 1088, New England Biolabs,
Beverly, MA) is ligated to the blunt-end dhfr fragment, and the
ligation products are digested with ClaI and purified. The
ClaI dhfr containing fragment is ligated into ClaI digested
plasmid pBS-CBN. An aliquot of the ligation reaction is
electroporated into E. coli and colonies harboring inserts
in a ClaI site of pBS-CBN are analyzed by restriction
enzyme analysis to determine the site of insertion and the
orientation of the insert. A plasmid with the dhfr
expression unit at the 3' end of the neo gene and with the
same transcriptional orientation as that of the neo gene is
identified and designated pBS-CBND.

Finally, the targeting construct is constructed by
insertion of the 5' targeting sequence (Figure 16) in the
unique SalI site located at the 3' end of the dhfr
expression unit in plasmid pBS-CBND. To obtain the 5'
targeting sequence, the plasmid pBS-H3/Bint.11-3 is
digested with EcoRI and PvuII and the resultant 1.2 kb
fragment is purified, ligated to EcoRI-SmaI digested
plasmid pBSIISK+ (Stratagene Inc., La Jolla, CA) and
electroporated into E. coli. Colonies containing inserts
in pBSIISK+ are analyzed by restriction enzyme analysis,
and one plasmid containing the insert is retained and
designated pBS-BI5. Plasmid pBS-BI5 is digested with SpeI
and EcoRV and made blunt-ended with the Klenow fragment of
DNA polymerase I. The resulting 1.2 kb fragment is ligated
to SalI digested plasmid pBS-CBND, which has been made
blunt-ended with the Klenow fragment of E. coli DNA
polymerase I. An aliquot of the blunt-end ligation
reaction is electroporated into E. coli and colonies
harboring inserts in the SalI site of pBS-CBND are analyzed
by restriction enzyme analysis to determine the orientation
of the insert. A plasmid with the EcoRI site at the 3' end
of the dhfr expression unit is identified and designated
pIFNβ-1.

Plasmid pIFNβ-1 is digested with BamHI for
transfection into human cells. Transfection of primary,
secondary, or immortalized human cells and isolation of
homologously recombinant cells expressing β-interferon may
be accomplished using the methods described in U.S. Serial
No. 08/243,391 and incorporated herein by reference.
Homologously recombinant cells may be identified by a PCR
screening strategy as exemplified therein and in published
methods available to one skilled in the art (see, for
example, Kim, H-S and Smithies, O., Nucl. Acids Res.
16:8887-8903 (1988)). The identification of cells
expressing β-interferon may also be accomplished using a
variety of assays based on the structure or properties of
β-interferon. For example, β-interferon may be identified
by an in vitro reverse passive hemagglutination assay
(Accurate Chemical Corp., Westbury, NY), stimulation of
superoxide anion production by mouse peritoneal macrophages
(Colligan, J.E. et al., Current Protocols in Immunology,
Wiley, New York, NY (1994)), or by using anti-β-interferon
antibodies in an ELISA assay.

The isolation of cells containing amplified copies of
the amplifiable marker gene and the activated β-interferon
locus is performed as described in U.S. Serial No.
07/985,586, incorporated herein by reference.

Equivalents

Those skilled in the art will recognize, or be able to 
ascertain using not more than routine experimentation, many 
equivalents to the specific embodiments of the invention 
described herein. Such equivalents are intended to be 
encompassed by the following claims.

1. A method for controlling (e.g. altering) the
   expression of a structural gene in a cell
   comprising the steps of:
   (a) providing a DNA construct comprising a
       targeting sequence, a regulatory sequence and
       a splice donor site;
   (b) establishing an intervening DNA sequence
       between the regulatory sequence and the
       structural gene by inserting the construct
       into the cell by homologous recombination at a
       preselected position relative to the
       structural gene to produce a homologously
       recombinant cell in which the inserted
       construct adopts a configuration whereby the
       regulatory sequence is separated from the
       structural gene by a preselected length of
       intervening DNA, the splice donor site being
       positioned such that cognate RNA of the
       intervening DNA is removed during
       post-transcriptional splicing of the primary
       transcript; and
   (c) controlling the expression of the structural
       gene by varying the length of the intervening
       DNA selected in step (b).

2. A DNA construct for use in the method of Claim 1
   and capable of altering the expression of a gene
   encoding thrombopoietin when inserted by homologous
   recombination into chromosomal DNA of a cell, said
   construct comprising:
   (a) a targeting sequence comprising DNA which
       hybridizes to genomic DNA within or upstream
       of the thrombopoietin gene;
   (b) a regulatory sequence;
   (c) an exon; and
   (d) an unpaired splice-donor site.



3. The DNA construct of Claim 2 wherein the regulatory
   sequence comprises a promoter.

4. The DNA construct of Claim 2 or Claim 3 further
   comprising a selectable marker gene.

5. The DNA construct of any one of Claims 2-4 further
   comprising an amplifiable marker gene.

6. The DNA construct of any one of Claims 2-5 further
   comprising a second targeting sequence comprising
   DNA which hybridizes to genomic DNA within or
   upstream of the thrombopoietin gene.

7. The DNA construct of any one of Claims 2-6 wherein
   the targeting sequence is selected from the group
   consisting of SEQ ID NO: 3, SEQ ID NO: 4 or
   fragment thereof or a sequence which hybridizes to
   a sequence selected from the group consisting of
   SEQ ID NO: 3, SEQ ID NO: 4 or fragments thereof.

8. The DNA construct of Claim 7 wherein the targeting
   sequence is a fragment of SEQ ID NO: 3 and is at
   least about 20 base pairs.

9. The DNA construct of Claim 7 wherein the targeting
   sequence is a fragment of SEQ ID NO: 4 and is at
   least about 20 base pairs.



10. The DNA construct of Claim 9 wherein the targeting
sequence is at least about 20 base pairs and is a
sequence between about nucleotides -1815 to -145,
14 to 245, or 374 to 570 of Figure 5 (SEQ ID NO: 4).



11. An isolated DNA molecule for use as part of the
construct of any one of Claims 2-10 being of at
least about 20 base pairs and selected from the
group consisting of SEQ ID NO: 3, a fragment
thereof, and a sequence which hybridizes to SEQ ID
NO: 3.

12. An isolated DNA molecule for use as part of the
construct of any one of Claims 2-10 being of at
least about 20 base pairs and selected from the
group consisting of a sequence between about
nucleotides -1815 to -145, 14 to 245, or 374 to 570
of Figure 5 (SEQ ID NO: 4), and a sequence which
hybridizes to a sequence between about nucleotides
-1815 to -145, 14 to 245, or 374 to 570 of Figure 5
(SEQ ID NO: 4).

13. A method of producing a homologously recombinant
cell wherein the expression of the thrombopoietin
gene is altered, comprising the steps of:
(a) transfecting a cell containing the
thrombopoietin gene with the DNA construct of
one of Claims 2-10; and
(b) maintaining the transfected cell under
conditions appropriate for homologous
recombination.

14. A homologously recombinant cell produced by the
method of Claim 13.



15. A homologously recombinant cell obtainable by the
method of Claim 1 which expresses thrombopoietin
comprising an exogenous regulatory region, an
exogenous exon, and an exogenous unpaired splice-donor
site operatively linked to an endogenous
splice acceptor site of the thrombopoietin gene.

16. The homologously recombinant cell of Claim 15
wherein the exogenous regulatory region, the
exogenous exon, and the exogenous unpaired splice-donor
site are operatively linked to the endogenous
splice acceptor site of the second or third exon of
the thrombopoietin gene.

17. A method for producing thrombopoietin comprising
the steps of maintaining the homologously
recombinant cell of any one of Claims 14 to 16
under conditions appropriate for the production of
thrombopoietin.

18. A method for producing thrombopoietin wherein the
expression of the thrombopoietin gene is altered,
comprising the steps of:

(a) transfecting a cell containing the
thrombopoietin gene with the DNA construct of
one of Claims 2-10; and

(b) maintaining the transfected cell under
conditions appropriate for homologous
recombination; and

(c) maintaining the homologously recombinant cell
produced in step (b) under conditions
appropriate for the production of
thrombopoietin.

19. A thrombopoietin produced by the method of Claim 17
or 18.

20. A pharmaceutical composition comprising the
thrombopoietin of Claim 19.

21. A method of providing thrombopoietin to a mammal in
need thereof comprising administering homologously
recombinant cells of any one of Claims 14 to 16 in
sufficient number to produce a therapeutically
effective amount of thrombopoietin in the mammal.

22. A DNA construct for use in the method of Claim 1
capable of altering the expression of a gene
encoding DNase I when inserted by homologous
recombination into chromosomal DNA of a cell, said
construct comprising:

(a) a targeting sequence comprising DNA which
hybridizes to genomic DNA within or upstream
of the DNase I gene;

(b) a regulatory sequence;

(c) an exon; and

(d) an unpaired splice-donor site.

23. The DNA construct of Claim 22 wherein the
regulatory sequence comprises a promoter.

24. The DNA construct of Claim 22 or 23 further
comprising a selectable marker gene.

25. The DNA construct of any one of Claims 22-24
further comprising an amplifiable marker gene.

26. The DNA construct of any one of Claims 22-25
further comprising a second targeting sequence
comprising DNA which hybridizes to genomic DNA
within or upstream of the DNase I gene.



27. The DNA construct of any one of Claims 22-26
wherein the targeting sequence is selected from the
group consisting of SEQ ID NO: 17, SEQ ID NO: 18 or
fragments thereof or a sequence which hybridizes to
a sequence selected from the group consisting of
SEQ ID NO: 17, SEQ ID NO: 18 or fragments thereof.

28. The DNA construct of Claim 27 wherein the targeting
sequence is a fragment of SEQ ID NO: 17 and is at
least about 20 base pairs.

29. The DNA construct of Claim 27 wherein the targeting
sequence is a fragment of SEQ ID NO: 18 and is at
least about 20 base pairs.

30. The DNA construct of Claim 29 wherein the targeting
sequence is at least about 20 base pairs and is a
sequence between about nucleotides -328 to -2 of
Figure 11 (SEQ ID NO: 18).

31. An isolated DNA molecule for use as part of the
construct of any one of Claims 22-30 being of at
least about 20 base pairs and selected from the
group consisting of SEQ ID NO: 17, a fragment
thereof, and a sequence which hybridizes to SEQ ID
NO: 17.

32. An isolated DNA molecule for use as part of the
construct of any one of Claims 22 to 30 being of at
least about 20 base pairs and selected from the
group consisting of a sequence between about
nucleotides -328 to -2 of Figure 11 (SEQ ID NO: 18)
and a sequence which hybridizes to a sequence
between about nucleotides -328 to -2 of Figure 11
(SEQ ID NO: 18).

33. A method of producing a homologously recombinant
cell wherein the expression of the DNase I gene is
altered, comprising the steps of:
(a) transfecting a cell containing the DNase I
gene with the DNA construct of one of Claims
22-30; and

(b) maintaining the transfected cell under
conditions appropriate for homologous
recombination.

34. A homologously recombinant cell produced by the
method of Claim 33.

35. A homologously recombinant cell obtainable by the
method of Claim 1 which expresses DNase I
comprising an exogenous regulatory region, an
exogenous exon, and an exogenous unpaired splice-donor
site operatively linked to an endogenous
splice acceptor site of the DNase I gene.

36. The homologously recombinant cell of Claim 35
wherein the exogenous regulatory region, the
exogenous exon, and the exogenous unpaired splice-donor
site are operatively linked to the endogenous
splice acceptor site of the second exon of the
DNase I gene.

37. A method for producing DNase I comprising the steps
of maintaining the homologously recombinant cell of
any one of Claims 34 to 36 under conditions
appropriate for the production of DNase I.

38. A method for producing DNase I wherein the
expression of the DNase I gene is altered,
comprising the steps of:

(a) transfecting a cell containing the DNase I
gene with the DNA construct of one of Claims
22-30; and

(b) maintaining the transfected cell under
conditions appropriate for homologous
recombination; and

(c) maintaining the homologously recombinant cell
produced in step (b) under conditions
appropriate for the production of DNase I.

39. A DNase I produced by the method of Claim 37 or 38.

40. A pharmaceutical composition comprising the DNase I
of Claim 39.

41. A method of providing DNase I to a mammal in need
thereof comprising administering homologously
recombinant cells of any one of Claims 34 to 36 in
sufficient number to produce a therapeutically
effective amount of DNase I in the mammal.

42. A DNA construct for use in the method of Claim 1
and capable of altering the expression of a gene
encoding β-interferon when inserted by homologous
recombination into chromosomal DNA of a cell, said
construct comprising:

(a) a targeting sequence comprising DNA which
hybridizes to genomic DNA within or upstream
of the β-interferon gene;

(b) a regulatory sequence;

(c) an exon;

(d) a splice-donor site;

(e) an intron; and

(f) a splice-acceptor site.

43. The DNA construct of Claim 42 wherein the
regulatory sequence comprises a promoter.



44. The DNA construct of Claim 42 or 43 further
comprising a selectable marker gene.

45. The DNA construct of any one of Claims 42-44
further comprising an amplifiable marker gene.

46. The DNA construct of any one of Claims 42-45
further comprising a second targeting sequence
comprising DNA which hybridizes to genomic DNA
within or upstream of the β-interferon gene.

47. The DNA construct of Claim 42 wherein the targeting
sequence is selected from the group consisting of
SEQ ID NO: 23, SEQ ID NO: 24 or fragments thereof
or a sequence which hybridizes to a sequence
selected from the group consisting of SEQ ID NO:
23, SEQ ID NO: 24 or fragments thereof.

48. The DNA construct of Claim 47 wherein the targeting
sequence is a fragment of SEQ ID NO: 23 and is at
least about 20 base pairs.

49. The DNA construct of Claim 47 wherein the targeting
sequence is a fragment of SEQ ID NO: 24 and is at
least about 20 base pairs.



50. An isolated DNA molecule for use as part of the
construct of any one of Claims 42-49 being of at
least about 20 base pairs and selected from the
group consisting of SEQ ID NO: 23, a fragment
thereof, and a sequence which hybridizes to SEQ ID
NO: 23.

51. A method of producing a homologously recombinant
cell wherein the expression of the β-interferon
gene is altered, comprising the steps of:
(a) transfecting a cell containing the
β-interferon gene with the DNA construct of one
of Claims 42-49; and
(b) maintaining the transfected cell under
conditions appropriate for homologous
recombination.

52. A homologously recombinant cell produced by the
method of Claim 51.

53. A homologously recombinant cell obtainable by the
method of Claim 1 which expresses β-interferon
comprising an exogenous regulatory region, an
exogenous exon, an exogenous splice-donor site, an
exogenous intron and an exogenous splice acceptor
site operatively linked to the β-interferon gene.

54. A method for producing β-interferon comprising the
steps of maintaining the homologously recombinant
cell of Claim 52 or 53 under conditions appropriate
for the production of β-interferon.

55. A method for producing β-interferon wherein the
expression of the β-interferon gene is altered,
comprising the steps of:

(a) transfecting a cell containing the
β-interferon gene with the DNA construct of one
of Claims 42-49; and

(b) maintaining the transfected cell under
conditions appropriate for homologous
recombination; and

(c) maintaining the homologously recombinant cell
produced in step (b) under conditions
appropriate for the production of
β-interferon.

56. A β-interferon produced by the method of Claim 54
or 55.

57. A pharmaceutical composition comprising the
β-interferon of Claim 56.

58. A method of providing β-interferon to a mammal in
need thereof comprising administering homologously
recombinant cells of Claim 52 or Claim 53 in
sufficient number to produce a therapeutically
effective amount of β-interferon in the mammal.

59. The DNA construct of any one of Claims 2-10, 22-30
or 42-49, isolated DNA of any one of Claims 11-12,
31-32, or 50, cell of any one of Claims 14-16,
34-36 or 52-53, thrombopoietin of Claim 19, DNase of
Claim 39, β-interferon of Claim 56, or
pharmaceutical composition of Claims 20, 40 or 57
for use in therapy, for example in:

(a) gene therapy;

(b) providing TPO to a mammal by introducing
homologously recombinant cells into the mammal
in a sufficient number to produce an effective
amount of TPO in the mammal;

(c) administering homologously recombinant cells
expressing DNase I to the trachea and lungs of
a cystic fibrosis patient to effect in vivo
secretion of DNase I for the relief of
respiratory distress;

(d) implanting homologously recombinant cells
expressing β-interferon into a patient
suffering from multiple sclerosis to effect in
vivo secretion of β-interferon to diminish
exacerbations associated with the disease;

(e) the delivery of TPO, β-interferon or DNase I
to a patient comprising the steps defined in
Claim 18, 38 or 55.

60. A graft (e.g. an autograft, allograft or xenograft)
comprising the DNA construct of any one of Claims
2-10, 22-30 or 42-49, isolated DNA of any one of
Claims 11-12, 31-32, or 50, cell of any one of
Claims 14-16, 34-36 or 52-53, thrombopoietin of
Claim 19, DNase of Claim 39 or β-interferon of
Claim 56.

61. The graft of Claim 60 for use in therapy, e.g. in
the therapies recited in Claim 59 (a) to (e).

62. A pharmaceutical composition or device comprising
the DNA construct of any one of Claims 2-10, 22-30
or 42-49, isolated DNA of any one of Claims 11-12,
31-32, or 50, cell of any one of Claims 14-16,
34-36 or 52-53, thrombopoietin of Claim 19, DNase of
Claim 39 or β-interferon of Claim 56, the
composition or device for example further
comprising a barrier device, a nebulizer, an
atomizer or being in a form suitable for delivery
by oral, intravenous, intramuscular, intranasal,
intratracheal or subcutaneous routes.
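
The coordinate limits recited in Claims 8-10, 12, 28-30 and 32 (a fragment of at least about 20 base pairs lying within a stated nucleotide window) can be checked mechanically. The following Python sketch is illustrative only. It assumes figure coordinates in which the A of the AUG codon is position 1 and the preceding base is -1, with no position 0, and it treats "at least about 20 base pairs" as a hard minimum of 20, which is a simplification. The names TPO_WINDOWS, DNASE_I_WINDOWS, span_length and fragment_qualifies are hypothetical and do not appear in the specification.

```python
from typing import List, Tuple

Window = Tuple[int, int]

# Nucleotide windows as recited in the claims (inclusive, figure coordinates).
TPO_WINDOWS: List[Window] = [(-1815, -145), (14, 245), (374, 570)]  # Claims 10, 12
DNASE_I_WINDOWS: List[Window] = [(-328, -2)]                        # Claims 30, 32

# Assumption: "at least about 20 base pairs" is modeled as a strict minimum.
MIN_LENGTH_BP = 20


def span_length(start: int, end: int) -> int:
    """Length in base pairs of the inclusive span [start, end].

    Assumes the numbering has no position 0, so a span crossing from
    negative to positive coordinates is one base shorter than end - start + 1.
    """
    if start == 0 or end == 0:
        raise ValueError("position 0 does not exist in this numbering")
    if start > end:
        raise ValueError("start must not exceed end")
    length = end - start + 1
    if start < 0 < end:
        length -= 1
    return length


def fragment_qualifies(start: int, end: int, windows: List[Window]) -> bool:
    """True if the fragment is long enough and lies wholly inside one window."""
    if span_length(start, end) < MIN_LENGTH_BP:
        return False
    return any(lo <= start and end <= hi for lo, hi in windows)


if __name__ == "__main__":
    # A 20 bp fragment inside the first thrombopoietin window.
    print(fragment_qualifies(-1000, -981, TPO_WINDOWS))
    # A fragment that runs past nucleotide 245 and so leaves its window.
    print(fragment_qualifies(230, 260, TPO_WINDOWS))
```

The check requires the fragment to fall entirely within a single window; a fragment that bridges two windows is rejected, which is one possible reading of the claim language and not the only one.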
XbaI (-6372)

-6373 -ICTAGAGTCAGGATCGCACnXSAAGCnxriCT^^ 




ApaI (-6233)

-6249 

-6187 GrrcrrcACMcrrGCCA^^ 

-6125 (XCCGCCACACCCCACMIACCIXX^^ 
-6063 




HindIII (-5985)

TOGGTrcCAATCCTG 



-6001 CCAGGGOXXXXTTAGGAAGCITAAGAWU^ 
-5939 



GCTGCACCACTOXXrrAGCT^^ 

-5877 

-5815 GACCACGCSGAQGCAATGCAGAG^ 
-5753 



BamHI (-5667)

^^^^^^ ^ A. _ _ — _ _ _ 

TC 



5691 ACTGCCATTGGAGTCrroAGAAG^ 



-5629 GGGTGAGGCCGGACTCAGCCAAAAGCAGCCCCTCC^ 



-5567 CGGCAGCGTGACCCCTCCTTGCrcCTTCCCXrri 



TTTCCCTCCSGGCCC 

ICTCACCGCCTGTAGGAGATAGAGAAGCG 
-5505 GAC^AGAGCGCCAGCAC5CGAGACTC^^ 

-5443 CXrAGCGCCACGAAGTCTGGGACGGGAGGA^ 
-5381 TGGCCCAGCCTCAACCACAACCCrGCTCTTCGC^ 



5319 



GTGTGGCCGTCACCAC 
ApaI (-5318)




-5257 C^TACACGAAGGACAGGCCGCTCGGCT^^ 
5195 GTAAGAACACGGGCTTCAC 



lGCTGGCCATGGGAAAGGCCAGTCCGACGCCCCATCCAA^ 



GTGGCC
-5133 CGGGACCTACrrATCGTGCX^^

-5009 GCnTTOATCrCT^^ 

-4947 TCCCGGAAAAGGCGGGAAACCCAACnx^^^ 
-4885 




-4823 ACT 




•4761 TAGCAAGGCTGCCATGAGAGIT^ 
-4699 <^CAGAGroGGaSATCACrrAAC^^ 



-4637 




VCCG 

-4575 




iCC 

-4513 



raccAc 

-4451 GTCKXnCCAACXXrATAGACCT^ 

-4389 GCGCIXXrCCAGCTCGCGCCGTCnxX^^ 

-4327 ACCGCGCCXriTCTGCCCCCGCCCACCC^^ 
-4265 




-4203 GCGCCCACCTACCCTGCTGCCCGAAOSGGCAGCG^ 



CCCCGG 

'TCTCAGAACGGATGGGCAGCAC 



-4141 «;<3GCKriOT ( 

-4079 GCCGGCGC^CGGGAGGGCHCGGCAT^ 
-4017 



kGGGCCGGGCCGG 

-3955 

GTCCCCGCGAGGGGC
NotI (-3885)

-3893 






-3831 

-3769 GCC<OTlTrATGCCCCGCGCCCGACGaXXX3GC0GGGGGCC^^ 
-3707 

-3645 

-3583 

-3521 






3459 GCCCAGGAAGGGAGCCnX=AGGCTAGGGMGGGCAGAGGCTO^ 
3397 OriGMSCGMXSCCCGGrrrC^ 



-3335 ApaI (-3307)



-3273 TCTCACTGCCTAGCCTCXXrTCCCTAC^^ 

-3211 CAGRACAC^CCTAGCCAGAAACCGGCAGCATrccxrCCTTCT^^ 

-3149 CTCTCATTGTAACOTATCCTCAGGCGCAITCGAC^ 

-3087 CTTCACXO^GGGACCCTCTCXrCTCTCCAGCCXACTCCCAG^ 

-3025 GGTCATcxxrrorcTccxrKnxrix^^ 



-2963 TATCCCAGCACCCTCCTTCCTAATCT 



T»3GAGACATCTCGTCTGGCTGGACGGGAAAATrc< 



-2901 AGGATCTA«K:CACACriCTCAGCAGACATGCCCATCC^^ 

-2839 CCTGAGGAAGTTCTGGGGGACAGGGGGATGATCGGATCAAGGT^ 

-2777 GGAC^GAGACTGTGGGGAGACTTGGGACTGGGAAGAAAGCAAAGGAGCTAGAGCCA 

-2715 AAGGAAAAGGGGGGCCAGCAGGGWGGTATITCCGGGGGAGGTCCAGCAGCTC^^
-2653 GACAGGGAOVCATCGGCCTCGriATTCCTCITCra^

ApaI (-2568)

-2591 CGGAGACAGAACAAGCAAAGGAGGGCCCTGGGCACAGAGGTCIG^ 
-2529 CaVCTGGACCCCAGCAGACGAGCACXrrAAGCTCAGGCnT^ 
-2467 ACTCnXX:CCCCXACCTGACC?rcCACTa^CCXn^ 
-2405 ATAAOVGGAGATTTCTCTCATXnXXSGCAATATC^^ 
-2343 AAGATAGGACTCCCTAGGGGATTACAGaUUlGAAAAGC^^ 
-2281 TCAGCAGCAGGTATCaTCnCCAGGGAAAAGAAATTTG^ 
-2219 CAATCTIAAAOAGACCTCTGrOCTTC^ 
-2157 CITCAAAAAACTTCTGCTCCTGTCCC^ 

BamHI (-2094)
-2095 CX^TCCrcCTCATCCAAATCnTCTCCGTCnxnc^^ 

-2033 CCAGGCAGGGVGCTCCAGGGAAGAtXAAGGCGTCAC^^ 
-IS"^! "I^SCTCCCTlXnxrrcATTGGGCAa 
-1909 GGGGCTGTGCCCCACCGCCACATG
-1761 AAATC3GGCTXXX»GCTGGGGGAGGGGCM3GC^
-1699 



•1637 <XrrcKTTTCCTCAGGGACrcATCAOT 
1575 




-1513 AACACy^GTAGTAAGATCGACACAGCCCCAATOCCCaiTl^^ 
-1451 




ApaI (-1377)
-1389 TAATaxa«3GAGGGCCawnCATCnTC^ 

-1327 GCAAfSCCTCTTTGCACAACTTOTGAA^ 

-1203 GTGGGCGCCT^^CAAGGTAAGCCXXrrAAanGGGC^ 
-1141 GGCAGCTGGTITCAGGAACXSAAGTCCCAGAACTCm'A 
-1079 GAGTATITCAaJACTTGGAGTCCAGAGAAAAGCTCCAGT^^ 
1017 GGGAAAGAATAGAGGTTAATlTCTCCCATACCGCCTITrAATCCT^ 
-955 GTTACAGCTITGTGCAGTTCCCCTCax:AGCC^ 
-893 CATATTCCGCCOSTTTGCCAGTTCCTCACCCAGGCCCnX^ 
-831 OC^GGCTGAAGCCACAATACTTTCCTTCTCT 

-769 ACCAAGGTTGCTCAGAATTTAAGGCTAATTAAGATATXnXSTCTATACATATC^ 



-707 GCTCTCAGCAGGGGTAGGTGGCACCAAATCCATOrCCGAOTCACTGAGG 



AGTCCTGACAAAA
-645 AGGftGACACaVIATG C ' l ' l ' lt. ' i - ll^Ti ' iViTK- I ' llt, I ' ril.n ' il- ' m ' iTi ' l ' mTitl AG

-582 ACXXSAtmrcaCTCTTATTOXXSUX^^ 

-519 TCCGCCTCCCAGGTACAACXXSATTCTCCTGT^^ 

-456 TGAACCACCACACCCTXXrrAUiTiTmi^TATTTCCT 

-393 AGGCTanX3GCGAACTCCIGACCTCAGGTC»TCC^ 

-330 TTAO^GGCATGAGCCACIGCAC^^ ^'^^^^ 
-267 ATTCAGQGCTTTXXXaGTTCCAGGCTGGTCa^ 
-204 CTGCCAGGC7^'lVi\-TiU-'l A GAAA(/ riU,U ' r ^ 

AUG (1) 

-15 j^aCftCCmyy r ftqft MS &&& as ACC S GrGaWSAACAa^CCrcAGGGGCTAGGGCC 

43 ATATGGAAACATCACAGAAGGGGAGAGAGAAAGGAGACACGCTCCAGGGGGa^^ 
106 GGAACCCATTCTCCCAAAAATAAGGGGTCTGAGGGGTGGA 

EcoRI (178) 

169 CCTCAATGGGAATTCCTXSGAATACCAGCTGACAATGATTTCC^^ 

232 TCTccTCATCTAAoaa TiscK:a££a££zi!CMsm:cE:cmAcrsQ& 
329 cicAerMACTsciicsiGM:m:£fiiQTccTrc^fiGCAs^i^ 

377 AGAACTCCCAACATTATCCCCTTTATCCGCGTAACTGGTAAGACACCCATA(^ 
440 CACCATCACTTCCTCTAACTCCTTCACCCAATGACTATTC^ 

503 GATCACACTCTCTCyvCAAGGATTATTCTTCACAATAC^^ 
566 GfiACT
Hindi (-4511)
-4512 GTCAACCTICACACTAATI«:TTCn^^ 



-4448 GATACCCTATAAAGCAAQGTAACXSrTAATCT 



roAGACCATGAATGGCCTTCAGCAGAGCAGAGT 



-4384 ATCATKOT^riTCAAAATTCAGAAGGAT^ 
-4320 CGTGCAGCmXSAAACCACAGOTCSrcGTC^^ 
-4256 TCCCTX3GCCT(X:ACGTCan«Xn^ 
-4192 AaX?ITCCTTO3AGGCCATITCOTT^^ 
-4128 ATITavax:AAATCAGAGCATCnX3ACCTa5OT 

-4000 GGTGTAAAAACACCTCATCCTCATCTCSUSAAC^^ 



TGGCT 

3936 GCa^aa«:CCATTCTCTCrK3GATGTX;AA^ 



-3872 CCaGGCCTATCICCAGACnXXXa3CCCAGai.TC^^ 



ApaI (-3851)

TTGTACCCCACTCACTCXXrCTCATCT 



-3808 GGGGCTTGGACCTACAGCTCGACAGCACXCATGGA^ 
-3744 CCC^CTTCX^CTTAG«;CGGCACCnX3TTC^^ 
-3680 



SAGGCCCAGGCAG 
-3616 C^GAGCCACCCCACX:AGACCTC3GCAGTCnx^^ 

-3552 •rc«:TCnTACATGGCAGCATIX5A^ 

-3488 gtcctc3gagactccaacaagccacaggctgcagg«x:a<^at^^ 



3424 TGTTCTGGGAATCTATCAGAGGAAC^CATAGAGGCTCCAGACGGTrc 



aaggcccaacagtgatc 



ApaI (-3353)

CCTCACCAAAGCCCGTC 



3360 CCAGACGGGCCCCATGTCAGACXAGGCTXrCTCCAGGGCTCTC 



3296 



CTGAGGGCAGCCACACAGCAGGCAGCACTCGCCAT^ 



GTTCCAGCCT
-3232 Tccnxr




-3168 GCATTTGGrcAAGA«X:AGGAGGGGATGACAGACC^ 

-3104 CACACGTAOKXSGTTCGGCACTTGCTCT^^ 

-3040 GCTCX:ACCAGGCAGTriXnTG(n«^^ 

-2976 GATCMCnXXXriCTCAGATGK^^ 

-2912 av<XCaATACTCCGGTO3CCTTCXr^^ 

-2848 -ITAGAGATrAAAAACAGGGAAGAACCATIXXriG^^ 

-2784 GCAGCCTCaVGGAGTGCnXSGlCTTT^ 

-2720 TCCACCAGTOnGCCAGCCAGAGCXXnCTC^^ 

-2656 GTCCaGGGroGOTrACCTIXXrK^^ 

-2592 TCGCC7.GACOGAGCACTTKXrroACT^^ 
-2528 




■2464 GCCAACCGTTCCAGGCCCTrcTCCCAGGC3GGACX:ACAGATXan«ACG^ 



TCT 

AGGGCTCTCCTTGGA 



■2400 GGGCCA«:ACAGCCCCTrcCAAGTGGGCAAGACCCAG^ 
•2336 AGCCCTGGAACCTCTGAATGTTC^TTTri^ 
2272 




iTCTTA 

2208 




2144 CCTGCAGAAACTCGAGGGCAAGGAGCATCCCCCAACCGCCCGGAGCCTC 



AG 

CAGGAGGCGCAAGGT 



2080 CCTACreACTCCCTGACTTCAGACGl^ 

2016 TTTAAGCAACCAAACTTCTGGTAGTITCACCAGTCTCAGGA^
-1952 AGATTCCAAGAAATGAGTCMCGGGGTXXXJGTOGC^

-1888 C3ATTGCTroC3GCTCAGCaCTK3C»^a^^ 

-1824 CGATC<?ICACGCCTCn3ATCCCAGCACTTT^^ 

-1760 GTTreAGACCAGTGTGACCAACATGGTCAAACCCTGTX^ 

-1632 CCCAGGAAGCAGAGGTlXXaunrSAGCaSAGATAGTATTA^^ 

-1568 CAAGATTCOXOTCAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACTGAGCA 

-1504 ACCKnGGTCCTCGTAOCXXXSGAGGATTCXXr^ 

-1440 AGAGCAAGACCCCATCTCTACCAAAAAAAITTAAAAATTAGCC^ 

-1376 GTCTTAGCTACTCAGGaM3GCTCAGGAG(X3A<3^ 

-1312 AGCCATGATTTGGCCAClXXACTCCAGCXnrrcasaACAC^ 



-1248 ATAAAAACCOVAAACAAAAGAACCAAGAAAITACTCGACCTXSAGCCTC^ 



rrCAAAAACA 
TTAGCTGCTGCC 



BamHI (-1162) 
-1184 CTC^CCTKlXaCTGGTCACTCGGATCCCroGGCCTA^ 

1120 AGCXTKXrCCACTGCTIXSGCTGGCAATTCCSGGTa^^ 
1056 GGCGCTGGTGCTGCAGGCCCCCACCACTa^^ 
-992 CTGCACCTGATCGCC5ATCAATCAGGAAaX3VGGCGl^ 
-928 CAGCCACCAGGGGGCTCOVTTTCX^CTTTCXy^a^^ 
ApaI (-860)

- 864 TTGCXSGCCCCCAGACAAGAGACAGGGAGACTOyiGCCCAGCCCCACCCTCCCTC^^ 
-800 

-736 CTITGGCCTTCyiTATCAGACATTlTAAAACTAAGTCCA^ 



FIGURE 10C






672 CAGCAGCAAGAAACCTtnxXTET 

607 ACACAGAGCCATTGTTTTCTGCACT^^ 



BamHI (-498) 

-542 CTCXrCTGAACTTTTAAAACTCCC^ 

-477 GCCAGGG 



FIGURE 10D
CAP site (-469)

-470 (XTIQ^J^QTI^^ TlMTlM - lMt - n ^AAQCAGCAAA 

-408 AGGAGAAAATTGTCATCAAAGGATATTC 

-346 ACATCAOa^TCATCTCftGGT^ 

-284 GCAGGGAGGGAGGCTTAGAGTCriCAT^^ 

SmaI (-220)
-222 TCCCGGGCGQGTTTTCTGGTGGATGGA^^ 

-160 TTTGGCTTTCTGGACXnT^ 
-98 CCACCAGCCXXrrcCCAGCroGGCTC 

AUG (1) 

-36 CITCTGTTATCnxrrCTGTTC ^ £3fi CI£ 

19 dS SQ£ £2Ce cm QSQ. Q£A CES £^ Q££ CE£ £^ GSXi GCC GIS 

109 MQMS2:CCMTGQC^CT£STQASCTAQMT Si:^ £A£ OS 

154 ^£QC3M:Q^^G£Qasffl!C£^Q^SE£^QM:^£a^ 
199 sec QQS CIS CES GaC CEC AST CSS QCa 

244 ^ h£Q mr SS^ SI^ hS2L ^ ^ SISk ^ ^ ^ 

289 ^SQ MS GAS SGQ cm ITC GTC TAC Cd QAC CAQ STC 

334 TCr SSG G
8646



-8581 GAAAATaVACATAAAAAGTCTGGTTCTOGAAAAGATATATA^ 
-8516 



-8451 GCAAATTTTATAGGOVTTCAAAGOCSTAATAAAAC^ 

-8386 TGATAAGTAAATAGaU^TGAACCAATTCCTTOAAAGACATAATCI^^ 

-8321 TAAACAATCTCSAATAGOCTATATCTATTAAATAAATTCAA^ 

o^^^ EcoRI (-8223) 

-8256 J^SGM^iCAClATGCXXMiATCX^^ 

-8191 GTAIOACTTTCTACAATCTCTTI^GAAC^ 

-8126 CTACSGCXrAGCATTACCTTAATACCaSAACTAGAAAATCACATT^ 

-8061 OATATCTCTCaTCSACAAAGATACAAACAT^^ 

-7996 TCTATCAAAAAATATACACCACAACCAACTAGAATTTATTCCAGATATCTAA^^ 

-7931 TTTCAAAATCAACTAACGTAATTTGTCCCATCAACAGG^ 

■7866 -I^TACavCACAGAAAAAGCATTTCACAAAATTTAACACCCATrcATX^ 



7801 CTAGGAATAGAO^AAAACTIXXrrcSMSCTTGAATOrAC^ 



TTCCTCTCAATTTTGCTATGAACCTCA 
7736 AACTCCTCTTAAAAAATAAACrrTTTrcATITAAAAAGAAAAC^^^ 
7671 CTATCnx:ATTTrAGACCAATCAGCTATCKATA(m'ACXX^ 
7606 TCTTTCTGGCAATGTTCCAGACTACATTTAAAAAATT^ 
7541 AAGAAAAATATGAAAATGCTTTCCCGTGTTAATGCTACTXr^^ 
7476 ACTTTATTTATATITCATTAGTTITTTACXrrAC^^
-7411 ATGOCACATTACATATAATICTCATOICTXX^
-7346 TTTTCTTATOTTTGAlX^OriT^ 

-7281 TAAAGTATATTTGTCATCAITTATACTCGGTAAGGGTTTC^ 
-7216 TCTCATCACATCATATCAAC3TTATATACCATCAATAT1^ 
-7151 ATTTXrrciTAlTTAGTGTATATKaVATGATAG^ 
-7086 TAATGATTATTTAGAGTTTCTCTTKATCT^^ 
-7021 TCTAAGACTTCTTTTTAT?ATCTGCATATTAC^ 
-6956 CTGTCATTCTATGGCCTCaCITl^^ 
-6891 TGCAAICTAATTAACAATCTTITCTITGTC^ 
-6826 ACTGAAGTCATGATGGCAlXXrrTCTATATTATTTT^^ 
-6761 TTAGACTTATAATTCACrGGAATTTTTTTC^^ 

-6696 TTTACATATAAATATATTTCCCTGlTTTTXrTAAAAAAGAAAAAGAT^ 



-6631 AATGCCATATTTTTTTCATAGGTCACTTACATATATCAAT^^^ 
• 65 66 -ITTATCAGCCTCACTGTCTATCCCCACACATCTCATCC^^ 
■6501 AACArrCTTTCCCATTITGTTCTACAAGAATA riU - ri^Ti A 
6436 TTTTACSAATGACXTTTCXXJ^GTrAACAAACAGCTT^^ 
6371 ATGTCy^AAAGAAAGTATACCTTCACT^TATTAAGTCTrrTAGTrC^ 



TCTC 



•63 06 CGTTTCTGCATTAACTTAGACATI^TTAATITX^^ 
•6241 ATTCATTTAAATCrKS^CTAACCT^ATTTACAATr^ 



6176 CTTCTITCCCTAGATTTATTTCXaAGTAGATTAT^^ 



PAA
-6111 AATtSTAATlTiraCCTITrTATTC^

EcoRI (-6032) 
-6046 -I^SOGACCTTGCTGAAriCTAAT^ 

-5981 AGATCATCTXX:ATATAATITITAAAATCTCATAACr^ 
-5916 AAGAAAATATCCAAATGGTCAATAAACATATCAAAACy^ 
-5851 ATGOU^TTAAACTATAATCatflGTATrATOSTACAAC^^ 
-5786 ACAATATCAAAGTTCGCAAGACnCTGATAa^C^ 



AAATIGCTrACAAACATTTCXXSAAGTCA 
-5656 CTATCACXrCAGirACTTCArrcTAGGCATATACCCA^^ 
-5591 AATACAGACAAGGAATTTCATACSGAGCATrAATTATCAl^^ 
-5526 AOTAGAAGGCSLTAAAACATO51«rrATACTTCT 
-5461 AAACTATACACACAAGATAGACGaATTTCGCAGAOiT^^ 
-5396 CAAAGCTCAAAAACACa^CAGAATCTAGAOT^ 
-5331 AAAGTACnxSACGAGAGAGAGGAGAGAC^TAATGAT^ 
-5266 •I^CCCAAATTTCACATGTTAAAACCTAATCCCCAATXX:^ 
-5201 GTCGATAAITAGGTAATCSGAACAAGAGCCCTAACAAAT^ 
5136 GAGCCTCy^GGQACCTIXnTTCCCGC^ 
5071 C^GCAAGCCCTCATCAGACACTGAATCT^ 
5006 CTATAAAAAGAAATGCTIXnTGlTTAAAAGGCAT^ 
4941 CAAGAGACTrAAGAGGGAACAAGAC3GGCCy.lT7.nx^^ 
4876 CAAAGAGTGCAGACCrrTTITATTl^TAACAATlX^^
-4811 -ITTICTATCrrATATTATTGTTT^



4746 AltnATTAATCrCTTATGAAAGAGTTK^^ 



TGG 



-4681 CTTCATTTATGTTCTCKXrACTGCTTATGCACC^^ 



"TGTATAATTIATCA 
-4616 TTCTTTCyiTAGGGACCCTCTOCCTrc^^ 

-4551 ACTAAATGTITTATrTCTAGATACATAGTAGlX^^ 
-4486 AATTCGCTCCTATCTCTCAAATT^^ 
-4421 GAAATCnCATTCAAGTTTTACTTTCr^ 

-4356 GAACTGCnt3CAGGGACIGGAA<nAGTTITC^ 
-4291 




-4226 GGGCACTAACCCTTACAATCXaVGATACACACrKX^^ 
-4161 CAGaAGGTTAAATAAATTITCCTC^lTAT^^ 

-4096 TAAAACTTAAAATGATCTATTTAAAAGGAAGAAATTTTA^^ 
-4031 

-3966 

*'""'~*^^-~»•^"**--^va^Jrf^Wl•A^jC-l•AGAGAACAAAGAGAAG 
-3901 GAAAAAAAAAATCCXrrTTTOATITTTC^^ 

-3836 CTTTATTTTCACCCTCCACAGCCATGAGAGCCTCK^ 




rCTCCCTGTT 

-3771 CCAATCACCTCTAACATTTCTIXXXTATT^^ 
-3706 CAAAGACCIXTTTC^ITAAGTCCAAATCCT 
-3641 CCTGACTTTTCCACCCTCAGCCTCCTTQATI^ 
-3576 TCn«CTTlXXnx:CTGCAAATI^^
-3511 AGCTCTCGATATCATGGTATCTATTSTCTIA^

-3446 AXTITATATGCTACri»GlCTAAATTC3^^ 

-3381 AGACTGTACACAAAATTTAATTATCTOlTGtfATAAT^ 

-3316 TAl"i"iuviGATATACTATCX:TAAATAAAACaTATrATTAAAAT™ 
-3251 CTTTCAATATGGCTACTAGSVCSCTTTT^^ 

-3186 AATCXX:CKaACCACATavCXrrCACCACAGCCACCTCT 

-3121 GGCACAClXXXTTGCATTAAGGGCAATGaATGCCTI^^ 

-3056 TTTCTTTCAGA<XX7VTCATCACX31TC^ 

-2991 AGACTCXTTCATATTClAa^GGAAAGATCS^ 

-2926 TCTCTATCITICACACATTACACACXXnxr^ 

-2861 GATAATAACCCATCTCAAATGTTTACTATCAGGATTATT^^ 

-2796 AATAAATC^TAACTAGTACTACCGCCACTACTGTTGITTT^^ 

- 2 73 1 AAGCSACCATTTCCGGATGGAGCATAAGAGACCATITGATGTGGGCAG'I^ 

-2666 CACCIXXiAAAGGTCAACTATATACAAGCCTGCAAGTCATTCT 

-2601 GACTCTATAGACTGTCTCCTCTTTCCTGAGAGGGACAC^^ 

-2536 CSCTCCTTGCATTGGCTTTTCTCKrr 

-2471 CAAAACCCCAAGGAATTACTCAAATACTGACATAACAGACATTITW^ 
-2406 TTTTTAATATTCTCAAACTCATTGTTITTAAAATGCATG^ 



-2341 



GGCCTOZAAAGOGAAAGGCAGAGAGAATCaAACCCATAGAGAGGCAGAATAACCAGAAAGGTrcG 



2276 GACTCGTTTATTTTATAATGTAAAITAGTCTATTATGAAACAATAC^^
-2211 TCGAAAATACAAAGAATAAAAGGAGGAAAAAAATCACTCTTTAGIT^

-2146 ACTATiaAAATGCrramTACTTCCTTI^ 

-2081 GTATGTACAATTlTATCOTCTAaTTTrcAATATTAAC^^ 

-2016 AAATAATATATGCTCATAATAGAACATTTTAAATtXa^TAAAACAA^ 

-1951 GTAATAlTXATTAAATITrcTCXAASTa^ 

-1886 GCCTAATAACCCrArrrayVGACCTCTTCT^ 

-1821 AAGGTATCAAGTCAAAAGATAAAGArrrTTC^^ 

-1756 CCXX3M3GGTAACTACTA1TAATAGATAGTAATICTAC^^ 

-1691 AGCATCATATgrATACCTTTCTACTAACTTAC^^ 

PvuII (-1580)

-1626 TCTATTGCTCTrTTCACTAAATGTATCTGTG^ 

-1561 TCGCTCAATAAlATTCCATCTTGTCCACtJ^ 

-1496 ATTTGTCTTTOTrACrATGATAGTA^^ 

-1431 TACACATGCACATACACATaZATATTTCTtXrACXK^ 

-1366 TGCAAGTTAAAGGAACAATCTCATIXXrTTCAAATTT^^ 

-1301 'KSGTCTCTCCTTGTRACSCTAGTIT^^ 

-1236 TCCTGGCCAAAGAGCAGAGTGCCACAGACCACAACTXXr^ 

-1171 TCTCTTITTTCTATTAATAACT^ 

-1106 GTAAGAGCITATTTTTCTGAACCAGGAAGTGGT^ 

-1041 CTCTTCTCriTAGCTTTTGTGAAAT«5TCAAAAACAT^^ 

-976 ACCCTGGTTOGGCCTTCTCTATCCTTGTC^
TGTCXSUnTrCTCCTCTGTCTAQGAT^^
-845 CCTAGACTATTCX:AGTGCCTTra«3AA(^^ 
-779 CXXy\AAACTCX?rTGCaGTTTCAGCT^^ 
-713 CTTTCAGTATCCAAAGAAC»TTGGTrcTAGGA(XAC^ 
-647 AGAGGAlXXrrCAATrCCCTCTTATAAAACGTTXX^ 
-581 GTATATirrAAATCATCCClAGATTACTrATAATAOC^ 
-515 ACACTGTAlCTTTAAAATTTACATTA Tm ' riXJnijViT^ ^ ^^ 
-449 AAATATTTlXXATCTACaGTCAGTAGAATCCACGGATACAGA^ 
-383 GTA'itl'ITl'rA G ' m ' .l ' l ' rm AG &l ' lVi ' m
-356 IATTCTC^^' ^^^TL • L\J^Vi • l^^ L.' iTi^ ^ ^ ^
-291 TTCACTC3AACTTTAAAAAACATTAGAAAACC^^ 
-226 TATCAT?AGATAGC»GCITAAATAAAGAGTTTTAGAAAC^^ 
-161 ACrGAAAGGGAGAAGTGAAAGTGGGAAATTCCTCTGAATAGAGA^ 

CAP (-81) 

-96 GGCCATACCCACG GASftMgGftCATTCTAACTnrAACC^^ 

AUG (1) 

25 an Qd CEC E2S US 2S3C uc ICC Ad Qd car ace aas asx: lac 
73 fiac us dT QS& HE dA csA aoQ ^ ^ m cas 2S2r cas aas 
121 ocdsasscaaiEsaaiJssassdisaaaacsscasM^ 

PvuII (199)

169 ME &ac TEC £5ac Mc fid isas fias AH cas as caa cas he cas 
217 aasjsaGGacscescaaasaccaicsMisasaaEaEfiasaacaaEm 
265 esa33:23EAsa£AaGATi£a2daQ£adss£2£^MiisasadasT 
313 siTJsasaacdCdGssdaaisnsiaicaicasaiaaaccaic^ 
361 acasiEdsisaasaaaaadEsasa^saasaiTiEaccasxiQsa aaa 

457 iacdsaa£Q£caa£Ga£iafiadca£2dGc:3G£a£caiasE:asa 
505 iffiiGaaaTCdaas£MC2Tr3BCTicA33:MCAsaaTaQassim: 

BglII (565)

553 d£ fisa aac axgA^vgATCTCe^AC^cir^Tr^rTxriv^^^ 

615 TCMCCftr^ftr,aTgCTSTTrMCTGACTGAT(y^AA
745 ^ATTI^TK^CAAAACnCAACAT^^
810 "'AT^^TOSCCAAGTACCTATIT^^ 
875 ^^CCCIX^CCTTTAAGGAATITAAAAT^ 
940 CAATAAGGGGACCTGAACCTTATGGGG^ 
1005 AAAAGGAAAari«3AGGGTCriXX5AACT^ 

1070 AT^XnCTCATCATAAACHTAGAATIX^^ 
1135 -rroTOTTCTCC^ 

1200 TAATTATGTXXXCCCCACCATCCCTCCAA